Overview

Dataset statistics

Number of variables: 21
Number of observations: 10866
Missing cells: 13434
Missing cells (%): 5.9%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 1.7 MiB
Average record size in memory: 168.0 B
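
The missing-cell percentage can be sanity-checked from the other figures in this table. The sketch below assumes the percentage is simply missing cells divided by total cells (variables × observations); the variable names are illustrative and not taken from the profiling tool.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 21
n_observations = 10866
missing_cells = 13434

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

Under that assumption the result rounds to the 5.9% reported above.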

Variable types

Numeric: 10
Text: 10
DateTime: 1

Alerts

Dataset has 1 (< 0.1%) duplicate rows (Duplicates)
id is highly overall correlated with release_year (High correlation)
popularity is highly overall correlated with budget and 4 other fields (High correlation)
budget is highly overall correlated with popularity and 4 other fields (High correlation)
revenue is highly overall correlated with popularity and 4 other fields (High correlation)
vote_count is highly overall correlated with popularity and 4 other fields (High correlation)
release_year is highly overall correlated with id (High correlation)
budget_adj is highly overall correlated with popularity and 4 other fields (High correlation)
revenue_adj is highly overall correlated with popularity and 4 other fields (High correlation)
homepage has 7930 (73.0%) missing values (Missing)
tagline has 2824 (26.0%) missing values (Missing)
keywords has 1493 (13.7%) missing values (Missing)
production_companies has 1030 (9.5%) missing values (Missing)
budget has 5696 (52.4%) zeros (Zeros)
revenue has 6016 (55.4%) zeros (Zeros)
budget_adj has 5696 (52.4%) zeros (Zeros)
revenue_adj has 6016 (55.4%) zeros (Zeros)

Reproduction

Analysis started: 2023-10-11 23:31:49.173566
Analysis finished: 2023-10-11 23:31:56.240453
Duration: 7.07 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
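
The reported duration follows from the two timestamps. A minimal check using only the standard library (the format string and variable names are choices made for this sketch):

```python
from datetime import datetime

# Timestamp layout used in the "Reproduction" block.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-10-11 23:31:49.173566", FMT)
finished = datetime.strptime("2023-10-11 23:31:56.240453", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```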

Variables

id
Real number (ℝ)

HIGH CORRELATION 

Distinct: 10865
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 66064.177
Minimum: 5
Maximum: 417859
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 85.0 KiB

Quantile statistics

Minimum: 5
5-th percentile: 1221
Q1: 10596.25
Median: 20669
Q3: 75610
95-th percentile: 288556
Maximum: 417859
Range: 417854
Interquartile range (IQR): 65013.75

Descriptive statistics

Standard deviation: 92130.137
Coefficient of variation (CV): 1.3945551
Kurtosis: 1.781869
Mean: 66064.177
Median Absolute Deviation (MAD): 15121.5
Skewness: 1.7322939
Sum: 7.1785335 × 10⁸
Variance: 8.4879621 × 10⁹
Monotonicity: Not monotonic
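
Several of the derived statistics for id can be recomputed from the reported values. The sketch below assumes the usual definitions (CV = standard deviation / mean, range = maximum − minimum, IQR = Q3 − Q1, variance = standard deviation squared); the variable names are illustrative.

```python
# Values copied from the id statistics.
mean, std = 66064.177, 92130.137
minimum, maximum = 5, 417859
q1, q3 = 10596.25, 75610

cv = std / mean                  # coefficient of variation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
variance = std ** 2              # variance from the standard deviation

print(cv, value_range, iqr, variance)
```

Because the inputs are already rounded, the recomputed CV and variance agree with the report only to the printed precision.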
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
42194 | 2 | < 0.1%
135397 | 1 | < 0.1%
9534 | 1 | < 0.1%
70476 | 1 | < 0.1%
44345 | 1 | < 0.1%
16358 | 1 | < 0.1%
20304 | 1 | < 0.1%
20544 | 1 | < 0.1%
18442 | 1 | < 0.1%
49870 | 1 | < 0.1%
Other values (10855) | 10855 | 99.9%

Value | Count | Frequency (%)
5 | 1 | < 0.1%
6 | 1 | < 0.1%
11 | 1 | < 0.1%
12 | 1 | < 0.1%
13 | 1 | < 0.1%
14 | 1 | < 0.1%
16 | 1 | < 0.1%
17 | 1 | < 0.1%
18 | 1 | < 0.1%
20 | 1 | < 0.1%

Value | Count | Frequency (%)
417859 | 1 | < 0.1%
414419 | 1 | < 0.1%
409696 | 1 | < 0.1%
395883 | 1 | < 0.1%
395560 | 1 | < 0.1%
386501 | 1 | < 0.1%
382517 | 1 | < 0.1%
378373 | 1 | < 0.1%
376823 | 1 | < 0.1%
374430 | 1 | < 0.1%

imdb_id
Text

Distinct: 10855
Distinct (%): > 99.9%
Missing: 10
Missing (%): 0.1%
Memory size: 85.0 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 97704
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
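
As an illustration of those character properties, the sketch below counts Unicode general categories for one sample value using Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters in `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# "Ll" is Lowercase Letter, "Nd" is Decimal Number.
counts = category_counts("tt0369610")
print(counts)
```

Each 9-character value of this form contributes two lowercase letters and seven decimal digits, which matches the two categories reported for this variable.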

Unique

Unique: 10854
Unique (%): > 99.9%

Sample

1st row: tt0369610
2nd row: tt1392190
3rd row: tt2908446
4th row: tt2488496
5th row: tt2820852
ValueCountFrequency (%)
tt0411951 2
 
< 0.1%
tt3659388 1
 
< 0.1%
tt1798684 1
 
< 0.1%
tt1964418 1
 
< 0.1%
tt1951266 1
 
< 0.1%
tt2908446 1
 
< 0.1%
tt2488496 1
 
< 0.1%
tt2820852 1
 
< 0.1%
tt1663202 1
 
< 0.1%
tt1340138 1
 
< 0.1%
Other values (10845) 10845
99.9%

Most occurring characters

ValueCountFrequency (%)
t 21712
22.2%
0 15108
15.5%
1 10229
10.5%
2 7559
 
7.7%
3 6667
 
6.8%
4 6546
 
6.7%
8 6328
 
6.5%
7 6087
 
6.2%
9 6077
 
6.2%
6 5822
 
6.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 75992
77.8%
Lowercase Letter 21712
 
22.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 15108
19.9%
1 10229
13.5%
2 7559
9.9%
3 6667
8.8%
4 6546
8.6%
8 6328
8.3%
7 6087
8.0%
9 6077
8.0%
6 5822
 
7.7%
5 5569
 
7.3%
Lowercase Letter
ValueCountFrequency (%)
t 21712
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 75992
77.8%
Latin 21712
 
22.2%

Most frequent character per script

Common
ValueCountFrequency (%)
0 15108
19.9%
1 10229
13.5%
2 7559
9.9%
3 6667
8.8%
4 6546
8.6%
8 6328
8.3%
7 6087
8.0%
9 6077
8.0%
6 5822
 
7.7%
5 5569
 
7.3%
Latin
ValueCountFrequency (%)
t 21712
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 97704
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 21712
22.2%
0 15108
15.5%
1 10229
10.5%
2 7559
 
7.7%
3 6667
 
6.8%
4 6546
 
6.7%
8 6328
 
6.5%
7 6087
 
6.2%
9 6077
 
6.2%
6 5822
 
6.0%

popularity
Real number (ℝ)

HIGH CORRELATION 

Distinct: 10814
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.64644095
Minimum: 6.5 × 10⁻⁵
Maximum: 32.985763
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 85.0 KiB

Quantile statistics

Minimum: 6.5 × 10⁻⁵
5-th percentile: 0.06425225
Q1: 0.20758275
Median: 0.3838555
Q3: 0.713817
95-th percentile: 2.0466017
Maximum: 32.985763
Range: 32.985698
Interquartile range (IQR): 0.50623425

Descriptive statistics

Standard deviation: 1.0001849
Coefficient of variation (CV): 1.5472178
Kurtosis: 210.99813
Mean: 0.64644095
Median Absolute Deviation (MAD): 0.2153785
Skewness: 9.8763313
Sum: 7024.2274
Variance: 1.0003699
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0.028143 | 2 | < 0.1%
0.144297 | 2 | < 0.1%
0.158021 | 2 | < 0.1%
0.430191 | 2 | < 0.1%
0.210808 | 2 | < 0.1%
0.22758 | 2 | < 0.1%
0.623706 | 2 | < 0.1%
0.109305 | 2 | < 0.1%
0.247926 | 2 | < 0.1%
0.326556 | 2 | < 0.1%
Other values (10804) | 10846 | 99.8%

Value | Count | Frequency (%)
6.5 × 10⁻⁵ | 1 | < 0.1%
0.000188 | 1 | < 0.1%
0.00062 | 1 | < 0.1%
0.000973 | 1 | < 0.1%
0.001115 | 1 | < 0.1%
0.001117 | 1 | < 0.1%
0.001315 | 1 | < 0.1%
0.001317 | 1 | < 0.1%
0.001349 | 1 | < 0.1%
0.001372 | 1 | < 0.1%

Value | Count | Frequency (%)
32.985763 | 1 | < 0.1%
28.419936 | 1 | < 0.1%
24.949134 | 1 | < 0.1%
14.311205 | 1 | < 0.1%
13.112507 | 1 | < 0.1%
12.971027 | 1 | < 0.1%
12.037933 | 1 | < 0.1%
11.422751 | 1 | < 0.1%
11.173104 | 1 | < 0.1%
10.739009 | 1 | < 0.1%

budget
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 557
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14625701
Minimum: 0
Maximum: 4.25 × 10⁸
Zeros: 5696
Zeros (%): 52.4%
Negative: 0
Negative (%): 0.0%
Memory size: 85.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 15000000
95-th percentile: 75000000
Maximum: 4.25 × 10⁸
Range: 4.25 × 10⁸
Interquartile range (IQR): 15000000

Descriptive statistics

Standard deviation: 30913214
Coefficient of variation (CV): 2.1136227
Kurtosis: 19.269436
Mean: 14625701
Median Absolute Deviation (MAD): 0
Skewness: 3.7172371
Sum: 1.5892287 × 10¹¹
Variance: 9.5562679 × 10¹⁴
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 5696 | 52.4%
20000000 | 190 | 1.7%
15000000 | 183 | 1.7%
25000000 | 178 | 1.6%
10000000 | 176 | 1.6%
30000000 | 165 | 1.5%
5000000 | 141 | 1.3%
40000000 | 134 | 1.2%
35000000 | 128 | 1.2%
12000000 | 120 | 1.1%
Other values (547) | 3755 | 34.6%

Value | Count | Frequency (%)
0 | 5696 | 52.4%
1 | 4 | < 0.1%
2 | 1 | < 0.1%
3 | 3 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
8 | 3 | < 0.1%
10 | 6 | 0.1%
11 | 1 | < 0.1%
12 | 2 | < 0.1%

Value | Count | Frequency (%)
425000000 | 1 | < 0.1%
380000000 | 1 | < 0.1%
300000000 | 1 | < 0.1%
280000000 | 1 | < 0.1%
270000000 | 1 | < 0.1%
260000000 | 2 | < 0.1%
258000000 | 1 | < 0.1%
255000000 | 1 | < 0.1%
250000000 | 7 | 0.1%
245000000 | 1 | < 0.1%

revenue
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 4702
Distinct (%): 43.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 39823320
Minimum: 0
Maximum: 2.7815058 × 10⁹
Zeros: 6016
Zeros (%): 55.4%
Negative: 0
Negative (%): 0.0%
Memory size: 85.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 24000000
95-th percentile: 2.1367216 × 10⁸
Maximum: 2.7815058 × 10⁹
Range: 2.7815058 × 10⁹
Interquartile range (IQR): 24000000

Descriptive statistics

Standard deviation: 1.1700349 × 10⁸
Coefficient of variation (CV): 2.9380646
Kurtosis: 73.168489
Mean: 39823320
Median Absolute Deviation (MAD): 0
Skewness: 6.6583972
Sum: 4.3272019 × 10¹¹
Variance: 1.3689816 × 10¹⁶
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 6016 | 55.4%
12000000 | 10 | 0.1%
10000000 | 8 | 0.1%
11000000 | 7 | 0.1%
6000000 | 6 | 0.1%
2000000 | 6 | 0.1%
5000000 | 6 | 0.1%
30000000 | 5 | < 0.1%
20000000 | 5 | < 0.1%
14000000 | 5 | < 0.1%
Other values (4692) | 4792 | 44.1%

Value | Count | Frequency (%)
0 | 6016 | 55.4%
2 | 2 | < 0.1%
3 | 3 | < 0.1%
5 | 2 | < 0.1%
6 | 2 | < 0.1%
9 | 2 | < 0.1%
10 | 1 | < 0.1%
11 | 3 | < 0.1%
12 | 1 | < 0.1%
13 | 2 | < 0.1%

Value | Count | Frequency (%)
2781505847 | 1 | < 0.1%
2068178225 | 1 | < 0.1%
1845034188 | 1 | < 0.1%
1519557910 | 1 | < 0.1%
1513528810 | 1 | < 0.1%
1506249360 | 1 | < 0.1%
1405035767 | 1 | < 0.1%
1327817822 | 1 | < 0.1%
1274219009 | 1 | < 0.1%
1215439994 | 1 | < 0.1%

original_title
Text

Distinct: 10571
Distinct (%): 97.3%
Missing: 0
Missing (%): 0.0%
Memory size: 85.0 KiB

Length

Max length: 104
Median length: 70
Mean length: 16.002209
Min length: 1

Characters and Unicode

Total characters: 173880
Distinct characters: 164
Distinct categories: 19
Distinct scripts: 2
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10294
Unique (%): 94.7%

Sample

1st row: Jurassic World
2nd row: Mad Max: Fury Road
3rd row: Insurgent
4th row: Star Wars: The Force Awakens
5th row: Furious 7
ValueCountFrequency (%)
the 3279
 
10.6%
of 969
 
3.1%
a 386
 
1.2%
in 327
 
1.1%
and 317
 
1.0%
to 227
 
0.7%
2 226
 
0.7%
211
 
0.7%
man 148
 
0.5%
for 113
 
0.4%
Other values (8859) 24843
80.0%

Most occurring characters

ValueCountFrequency (%)
20178
 
11.6%
e 17617
 
10.1%
a 10758
 
6.2%
o 10239
 
5.9%
r 9356
 
5.4%
n 9343
 
5.4%
i 9062
 
5.2%
t 8414
 
4.8%
s 6857
 
3.9%
h 6463
 
3.7%
Other values (154) 65593
37.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 122010
70.2%
Uppercase Letter 27184
 
15.6%
Space Separator 20195
 
11.6%
Other Punctuation 2612
 
1.5%
Decimal Number 1140
 
0.7%
Dash Punctuation 212
 
0.1%
Modifier Symbol 114
 
0.1%
Other Symbol 101
 
0.1%
Currency Symbol 78
 
< 0.1%
Other Number 71
 
< 0.1%
Other values (9) 163
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 17617
14.4%
a 10758
 
8.8%
o 10239
 
8.4%
r 9356
 
7.7%
n 9343
 
7.7%
i 9062
 
7.4%
t 8414
 
6.9%
s 6857
 
5.6%
h 6463
 
5.3%
l 5845
 
4.8%
Other values (35) 28056
23.0%
Uppercase Letter
ValueCountFrequency (%)
T 3616
 
13.3%
S 2228
 
8.2%
B 1815
 
6.7%
M 1783
 
6.6%
C 1613
 
5.9%
D 1579
 
5.8%
A 1541
 
5.7%
L 1281
 
4.7%
H 1244
 
4.6%
P 1197
 
4.4%
Other values (27) 9287
34.2%
Other Punctuation
ValueCountFrequency (%)
: 1097
42.0%
' 560
21.4%
. 368
 
14.1%
, 154
 
5.9%
& 153
 
5.9%
! 123
 
4.7%
? 41
 
1.6%
/ 29
 
1.1%
¡ 14
 
0.5%
12
 
0.5%
Other values (14) 61
 
2.3%
Decimal Number
ValueCountFrequency (%)
2 327
28.7%
3 172
15.1%
1 172
15.1%
0 167
14.6%
4 81
 
7.1%
5 67
 
5.9%
9 48
 
4.2%
7 42
 
3.7%
6 35
 
3.1%
8 29
 
2.5%
Currency Symbol
ValueCountFrequency (%)
29
37.2%
¤ 18
23.1%
¢ 12
15.4%
£ 10
 
12.8%
¥ 5
 
6.4%
$ 4
 
5.1%
Other Number
ValueCountFrequency (%)
¹ 22
31.0%
³ 14
19.7%
¼ 14
19.7%
½ 9
12.7%
² 7
 
9.9%
¾ 5
 
7.0%
Modifier Symbol
ValueCountFrequency (%)
¸ 63
55.3%
¨ 24
 
21.1%
´ 14
 
12.3%
˜ 9
 
7.9%
¯ 4
 
3.5%
Other Symbol
ValueCountFrequency (%)
© 60
59.4%
20
 
19.8%
° 11
 
10.9%
¦ 7
 
6.9%
® 3
 
3.0%
Math Symbol
ValueCountFrequency (%)
± 9
36.0%
¬ 8
32.0%
× 5
20.0%
+ 3
 
12.0%
Final Punctuation
ValueCountFrequency (%)
» 8
38.1%
7
33.3%
3
 
14.3%
3
 
14.3%
Initial Punctuation
ValueCountFrequency (%)
7
29.2%
7
29.2%
« 5
20.8%
5
20.8%
Dash Punctuation
ValueCountFrequency (%)
- 206
97.2%
5
 
2.4%
1
 
0.5%
Open Punctuation
ValueCountFrequency (%)
( 16
43.2%
14
37.8%
7
18.9%
Space Separator
ValueCountFrequency (%)
20178
99.9%
  17
 
0.1%
Other Letter
ValueCountFrequency (%)
ª 13
61.9%
º 8
38.1%
Close Punctuation
ValueCountFrequency (%)
) 16
100.0%
Modifier Letter
ValueCountFrequency (%)
ˆ 11
100.0%
Format
ValueCountFrequency (%)
­ 6
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 149210
85.8%
Common 24670
 
14.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 17617
 
11.8%
a 10758
 
7.2%
o 10239
 
6.9%
r 9356
 
6.3%
n 9343
 
6.3%
i 9062
 
6.1%
t 8414
 
5.6%
s 6857
 
4.6%
h 6463
 
4.3%
l 5845
 
3.9%
Other values (73) 55256
37.0%
Common
ValueCountFrequency (%)
20178
81.8%
: 1097
 
4.4%
' 560
 
2.3%
. 368
 
1.5%
2 327
 
1.3%
- 206
 
0.8%
3 172
 
0.7%
1 172
 
0.7%
0 167
 
0.7%
, 154
 
0.6%
Other values (71) 1269
 
5.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 172797
99.4%
None 915
 
0.5%
Punctuation 99
 
0.1%
Currency Symbols 29
 
< 0.1%
Letterlike Symbols 20
 
< 0.1%
Modifier Letters 20
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
20178
 
11.7%
e 17617
 
10.2%
a 10758
 
6.2%
o 10239
 
5.9%
r 9356
 
5.4%
n 9343
 
5.4%
i 9062
 
5.2%
t 8414
 
4.9%
s 6857
 
4.0%
h 6463
 
3.7%
Other values (73) 64510
37.3%
None
ValueCountFrequency (%)
à 162
 
17.7%
¸ 63
 
6.9%
© 60
 
6.6%
à 50
 
5.5%
ì 31
 
3.4%
¨ 24
 
2.6%
¹ 22
 
2.4%
å 21
 
2.3%
¤ 18
 
2.0%
ã 17
 
1.9%
Other values (52) 447
48.9%
Currency Symbols
ValueCountFrequency (%)
29
100.0%
Letterlike Symbols
ValueCountFrequency (%)
20
100.0%
Punctuation
ValueCountFrequency (%)
14
14.1%
12
12.1%
11
11.1%
8
8.1%
7
7.1%
7
7.1%
7
7.1%
7
7.1%
7
7.1%
5
 
5.1%
Other values (5) 14
14.1%
Modifier Letters
ValueCountFrequency (%)
ˆ 11
55.0%
˜ 9
45.0%

cast
Text

Distinct: 10719
Distinct (%): 99.3%
Missing: 76
Missing (%): 0.7%
Memory size: 85.0 KiB

Length

Max length: 110
Median length: 94
Mean length: 67.872567
Min length: 7

Characters and Unicode

Total characters: 732345
Distinct characters: 126
Distinct categories: 16
Distinct scripts: 2
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10666
Unique (%): 98.9%

Sample

1st row: Chris Pratt|Bryce Dallas Howard|Irrfan Khan|Vincent D'Onofrio|Nick Robinson
2nd row: Tom Hardy|Charlize Theron|Hugh Keays-Byrne|Nicholas Hoult|Josh Helman
3rd row: Shailene Woodley|Theo James|Kate Winslet|Ansel Elgort|Miles Teller
4th row: Harrison Ford|Mark Hamill|Carrie Fisher|Adam Driver|Daisy Ridley
5th row: Vin Diesel|Paul Walker|Jason Statham|Michelle Rodriguez|Dwayne Johnson
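
The sample rows suggest that cast packs several names into one string separated by `|`. The sketch below splits one sample value into individual names; treating `|` as the delimiter is an assumption based on these samples only. It also shows that Unicode classifies `|` as a Math Symbol (`Sm`), which would explain why that category appears among this variable's character statistics.

```python
import unicodedata

# First sample row of the cast variable.
cast_value = "Chris Pratt|Bryce Dallas Howard|Irrfan Khan|Vincent D'Onofrio|Nick Robinson"

# Assumption: "|" separates individual cast members.
names = cast_value.split("|")
print(names)

# General category of the delimiter itself.
delimiter_category = unicodedata.category("|")
print(delimiter_category)
```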
ValueCountFrequency (%)
michael 285
 
0.4%
john 231
 
0.3%
de 209
 
0.3%
james 177
 
0.3%
robert 158
 
0.2%
tom 154
 
0.2%
lee 139
 
0.2%
jason 132
 
0.2%
van 131
 
0.2%
david 128
 
0.2%
Other values (46644) 64967
97.4%

Most occurring characters

ValueCountFrequency (%)
e 65672
 
9.0%
a 61498
 
8.4%
55922
 
7.6%
n 50507
 
6.9%
r 44415
 
6.1%
i 42812
 
5.8%
| 41783
 
5.7%
o 37725
 
5.2%
l 35122
 
4.8%
t 24677
 
3.4%
Other values (116) 272212
37.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 517877
70.7%
Uppercase Letter 112598
 
15.4%
Space Separator 55935
 
7.6%
Math Symbol 41835
 
5.7%
Other Punctuation 2123
 
0.3%
Dash Punctuation 811
 
0.1%
Other Symbol 591
 
0.1%
Format 130
 
< 0.1%
Modifier Symbol 119
 
< 0.1%
Other Number 107
 
< 0.1%
Other values (6) 219
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 65672
12.7%
a 61498
11.9%
n 50507
9.8%
r 44415
 
8.6%
i 42812
 
8.3%
o 37725
 
7.3%
l 35122
 
6.8%
t 24677
 
4.8%
s 24214
 
4.7%
h 19623
 
3.8%
Other values (24) 111612
21.6%
Uppercase Letter
ValueCountFrequency (%)
M 9756
 
8.7%
J 9025
 
8.0%
S 8852
 
7.9%
C 8638
 
7.7%
B 8209
 
7.3%
D 7012
 
6.2%
R 6503
 
5.8%
A 6314
 
5.6%
L 5342
 
4.7%
H 5002
 
4.4%
Other values (24) 37945
33.7%
Other Punctuation
ValueCountFrequency (%)
. 1300
61.2%
' 481
 
22.7%
¡ 147
 
6.9%
, 49
 
2.3%
43
 
2.0%
§ 36
 
1.7%
28
 
1.3%
16
 
0.8%
10
 
0.5%
" 6
 
0.3%
Other values (5) 7
 
0.3%
Other Number
ValueCountFrequency (%)
³ 50
46.7%
¼ 49
45.8%
² 4
 
3.7%
¾ 2
 
1.9%
¹ 1
 
0.9%
½ 1
 
0.9%
Other Symbol
ValueCountFrequency (%)
© 567
95.9%
11
 
1.9%
® 9
 
1.5%
° 3
 
0.5%
¦ 1
 
0.2%
Modifier Symbol
ValueCountFrequency (%)
¨ 53
44.5%
¯ 28
23.5%
¸ 24
20.2%
´ 13
 
10.9%
˜ 1
 
0.8%
Currency Symbol
ValueCountFrequency (%)
¥ 39
54.2%
¤ 20
27.8%
£ 6
 
8.3%
4
 
5.6%
¢ 3
 
4.2%
Initial Punctuation
ValueCountFrequency (%)
« 56
87.5%
6
 
9.4%
1
 
1.6%
1
 
1.6%
Math Symbol
ValueCountFrequency (%)
| 41783
99.9%
± 50
 
0.1%
¬ 2
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 804
99.1%
6
 
0.7%
1
 
0.1%
Decimal Number
ValueCountFrequency (%)
0 12
48.0%
5 12
48.0%
2 1
 
4.0%
Space Separator
ValueCountFrequency (%)
55922
> 99.9%
  13
 
< 0.1%
Other Letter
ValueCountFrequency (%)
º 30
93.8%
ª 2
 
6.2%
Final Punctuation
ValueCountFrequency (%)
» 11
73.3%
4
 
26.7%
Open Punctuation
ValueCountFrequency (%)
7
63.6%
4
36.4%
Format
ValueCountFrequency (%)
­ 130
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 630506
86.1%
Common 101839
 
13.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 65672
 
10.4%
a 61498
 
9.8%
n 50507
 
8.0%
r 44415
 
7.0%
i 42812
 
6.8%
o 37725
 
6.0%
l 35122
 
5.6%
t 24677
 
3.9%
s 24214
 
3.8%
h 19623
 
3.1%
Other values (59) 224241
35.6%
Common
ValueCountFrequency (%)
55922
54.9%
| 41783
41.0%
. 1300
 
1.3%
- 804
 
0.8%
© 567
 
0.6%
' 481
 
0.5%
¡ 147
 
0.1%
­ 130
 
0.1%
« 56
 
0.1%
¨ 53
 
0.1%
Other values (47) 596
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 729305
99.6%
None 2939
 
0.4%
Punctuation 85
 
< 0.1%
Letterlike Symbols 11
 
< 0.1%
Currency Symbols 4
 
< 0.1%
Modifier Letters 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 65672
 
9.0%
a 61498
 
8.4%
55922
 
7.7%
n 50507
 
6.9%
r 44415
 
6.1%
i 42812
 
5.9%
| 41783
 
5.7%
o 37725
 
5.2%
l 35122
 
4.8%
t 24677
 
3.4%
Other values (54) 269172
36.9%
None
ValueCountFrequency (%)
à 1409
47.9%
© 567
19.3%
¡ 147
 
5.0%
­ 130
 
4.4%
« 56
 
1.9%
¨ 53
 
1.8%
± 50
 
1.7%
³ 50
 
1.7%
¼ 49
 
1.7%
Ä 48
 
1.6%
Other values (37) 380
 
12.9%
Punctuation
ValueCountFrequency (%)
28
32.9%
16
18.8%
10
 
11.8%
7
 
8.2%
6
 
7.1%
6
 
7.1%
4
 
4.7%
4
 
4.7%
1
 
1.2%
1
 
1.2%
Other values (2) 2
 
2.4%
Letterlike Symbols
ValueCountFrequency (%)
11
100.0%
Currency Symbols
ValueCountFrequency (%)
4
100.0%
Modifier Letters
ValueCountFrequency (%)
˜ 1
100.0%
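
The non-ASCII characters reported for cast are dominated by Ã and ©. That pattern is consistent with, though not proof of, UTF-8 text having been decoded as Latin-1 or Windows-1252 at some point before profiling. The sketch below reproduces the effect for a single accented letter; the variable names are illustrative and the diagnosis is an assumption, not something the report states.

```python
# A single accented character as it would appear in a correctly decoded name.
original = "é"

# Encoding as UTF-8 and decoding as Latin-1 turns one character into two.
garbled = original.encode("utf-8").decode("latin-1")
print(garbled)

# Reversing the two steps recovers the original character.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)
```

If the assumption holds for this dataset, re-reading the source file with the correct encoding would be a safer fix than repairing strings one by one.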

homepage
Text

MISSING 

Distinct: 2896
Distinct (%): 98.6%
Missing: 7930
Missing (%): 73.0%
Memory size: 85.0 KiB

Length

Max length: 242
Median length: 89
Mean length: 37.146117
Min length: 13

Characters and Unicode

Total characters: 109061
Distinct characters: 83
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2868
Unique (%): 97.7%

Sample

1st row: http://www.jurassicworld.com/
2nd row: http://www.madmaxmovie.com/
3rd row: http://www.thedivergentseries.movie/#insurgent
4th row: http://www.starwars.com/films/star-wars-episode-vii
5th row: http://www.furious7.com/
ValueCountFrequency (%)
http://www.missionimpossible.com 5
 
0.2%
http://www.thehungergames.movie 4
 
0.1%
http://www.transformersmovie.com 4
 
0.1%
http://phantasm.com 4
 
0.1%
http://www.kungfupanda.com 4
 
0.1%
http://www.georgecarlin.com 3
 
0.1%
http://www.lordoftherings.net 3
 
0.1%
http://www.americanreunionmovie.com 3
 
0.1%
http://www.thehobbit.com 3
 
0.1%
http://www.jeffdunham.com 3
 
0.1%
Other values (2878) 2900
98.8%

Most occurring characters

ValueCountFrequency (%)
t 9865
 
9.0%
/ 9843
 
9.0%
e 7545
 
6.9%
w 7528
 
6.9%
o 7522
 
6.9%
m 6035
 
5.5%
. 5844
 
5.4%
h 5380
 
4.9%
i 5336
 
4.9%
c 4475
 
4.1%
Other values (73) 39688
36.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 86844
79.6%
Other Punctuation 18787
 
17.2%
Dash Punctuation 1265
 
1.2%
Decimal Number 1247
 
1.1%
Uppercase Letter 648
 
0.6%
Connector Punctuation 176
 
0.2%
Math Symbol 81
 
0.1%
Open Punctuation 6
 
< 0.1%
Close Punctuation 6
 
< 0.1%
Currency Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 9865
 
11.4%
e 7545
 
8.7%
w 7528
 
8.7%
o 7522
 
8.7%
m 6035
 
6.9%
h 5380
 
6.2%
i 5336
 
6.1%
c 4475
 
5.2%
p 4174
 
4.8%
a 3960
 
4.6%
Other values (17) 25024
28.8%
Uppercase Letter
ValueCountFrequency (%)
T 61
 
9.4%
S 59
 
9.1%
M 54
 
8.3%
A 46
 
7.1%
E 40
 
6.2%
F 36
 
5.6%
B 35
 
5.4%
D 34
 
5.2%
H 28
 
4.3%
C 27
 
4.2%
Other values (17) 228
35.2%
Other Punctuation
ValueCountFrequency (%)
/ 9843
52.4%
. 5844
31.1%
: 2941
 
15.7%
? 59
 
0.3%
# 40
 
0.2%
% 26
 
0.1%
& 18
 
0.1%
! 8
 
< 0.1%
, 6
 
< 0.1%
' 2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 240
19.2%
0 199
16.0%
1 196
15.7%
3 141
11.3%
4 91
 
7.3%
8 82
 
6.6%
9 78
 
6.3%
7 75
 
6.0%
5 74
 
5.9%
6 71
 
5.7%
Math Symbol
ValueCountFrequency (%)
= 76
93.8%
+ 5
 
6.2%
Open Punctuation
ValueCountFrequency (%)
( 5
83.3%
{ 1
 
16.7%
Close Punctuation
ValueCountFrequency (%)
) 5
83.3%
} 1
 
16.7%
Dash Punctuation
ValueCountFrequency (%)
- 1265
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 176
100.0%
Currency Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 87492
80.2%
Common 21569
 
19.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 9865
 
11.3%
e 7545
 
8.6%
w 7528
 
8.6%
o 7522
 
8.6%
m 6035
 
6.9%
h 5380
 
6.1%
i 5336
 
6.1%
c 4475
 
5.1%
p 4174
 
4.8%
a 3960
 
4.5%
Other values (44) 25672
29.3%
Common
ValueCountFrequency (%)
/ 9843
45.6%
. 5844
27.1%
: 2941
 
13.6%
- 1265
 
5.9%
2 240
 
1.1%
0 199
 
0.9%
1 196
 
0.9%
_ 176
 
0.8%
3 141
 
0.7%
4 91
 
0.4%
Other values (19) 633
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 109058
> 99.9%
None 2
 
< 0.1%
Currency Symbols 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 9865
 
9.0%
/ 9843
 
9.0%
e 7545
 
6.9%
w 7528
 
6.9%
o 7522
 
6.9%
m 6035
 
5.5%
. 5844
 
5.4%
h 5380
 
4.9%
i 5336
 
4.9%
c 4475
 
4.1%
Other values (70) 39685
36.4%
None
ValueCountFrequency (%)
â 1
50.0%
Ž 1
50.0%
Currency Symbols
ValueCountFrequency (%)
1
100.0%

director
Text

Distinct: 5067
Distinct (%): 46.8%
Missing: 44
Missing (%): 0.4%
Memory size: 85.0 KiB

Length

Max length: 533
Median length: 169
Mean length: 14.558122
Min length: 2

Characters and Unicode

Total characters: 157548
Distinct characters: 96
Distinct categories: 18
Distinct scripts: 2
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3217
Unique (%): 29.7%

Sample

1st row: Colin Trevorrow
2nd row: George Miller
3rd row: Robert Schwentke
4th row: J.J. Abrams
5th row: James Wan
ValueCountFrequency (%)
john 436
 
1.8%
michael 308
 
1.3%
david 301
 
1.3%
robert 212
 
0.9%
peter 201
 
0.8%
james 162
 
0.7%
richard 159
 
0.7%
paul 144
 
0.6%
mark 110
 
0.5%
lee 107
 
0.5%
Other values (6202) 21600
91.0%

Most occurring characters

ValueCountFrequency (%)
e 14631
 
9.3%
12934
 
8.2%
a 12669
 
8.0%
n 11046
 
7.0%
r 10688
 
6.8%
o 9160
 
5.8%
i 9108
 
5.8%
l 7511
 
4.8%
t 5475
 
3.5%
s 5207
 
3.3%
Other values (86) 59119
37.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 116535
74.0%
Uppercase Letter 25718
 
16.3%
Space Separator 12936
 
8.2%
Math Symbol 1088
 
0.7%
Other Punctuation 839
 
0.5%
Dash Punctuation 178
 
0.1%
Other Symbol 118
 
0.1%
Format 35
 
< 0.1%
Other Number 32
 
< 0.1%
Modifier Symbol 28
 
< 0.1%
Other values (8) 41
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 2357
 
9.2%
J 2208
 
8.6%
M 2207
 
8.6%
R 1800
 
7.0%
B 1686
 
6.6%
C 1612
 
6.3%
D 1484
 
5.8%
A 1433
 
5.6%
G 1264
 
4.9%
L 1223
 
4.8%
Other values (20) 8444
32.8%
Lowercase Letter
ValueCountFrequency (%)
e 14631
12.6%
a 12669
10.9%
n 11046
9.5%
r 10688
9.2%
o 9160
 
7.9%
i 9108
 
7.8%
l 7511
 
6.4%
t 5475
 
4.7%
s 5207
 
4.5%
h 4545
 
3.9%
Other values (16) 26495
22.7%
Other Punctuation
ValueCountFrequency (%)
. 645
76.9%
¡ 65
 
7.7%
' 52
 
6.2%
30
 
3.6%
, 18
 
2.1%
§ 15
 
1.8%
5
 
0.6%
5
 
0.6%
4
 
0.5%
Modifier Symbol
ValueCountFrequency (%)
¨ 12
42.9%
´ 9
32.1%
¸ 5
17.9%
¯ 2
 
7.1%
Math Symbol
ValueCountFrequency (%)
| 1070
98.3%
± 17
 
1.6%
¬ 1
 
0.1%
Other Symbol
ValueCountFrequency (%)
© 112
94.9%
¦ 5
 
4.2%
1
 
0.8%
Currency Symbol
ValueCountFrequency (%)
¥ 11
57.9%
¤ 7
36.8%
1
 
5.3%
Space Separator
ValueCountFrequency (%)
12934
> 99.9%
  2
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 177
99.4%
1
 
0.6%
Other Number
ValueCountFrequency (%)
³ 27
84.4%
¼ 5
 
15.6%
Open Punctuation
ValueCountFrequency (%)
4
80.0%
( 1
 
20.0%
Other Letter
ValueCountFrequency (%)
º 3
75.0%
ª 1
 
25.0%
Initial Punctuation
ValueCountFrequency (%)
« 3
75.0%
1
 
25.0%
Final Punctuation
ValueCountFrequency (%)
» 3
75.0%
1
 
25.0%
Format
ValueCountFrequency (%)
­ 35
100.0%
Control
ValueCountFrequency (%)
3
100.0%
Decimal Number
ValueCountFrequency (%)
9 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 142257
90.3%
Common 15291
 
9.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 14631
 
10.3%
a 12669
 
8.9%
n 11046
 
7.8%
r 10688
 
7.5%
o 9160
 
6.4%
i 9108
 
6.4%
l 7511
 
5.3%
t 5475
 
3.8%
s 5207
 
3.7%
h 4545
 
3.2%
Other values (48) 52217
36.7%
Common
ValueCountFrequency (%)
12934
84.6%
| 1070
 
7.0%
. 645
 
4.2%
- 177
 
1.2%
© 112
 
0.7%
¡ 65
 
0.4%
' 52
 
0.3%
­ 35
 
0.2%
30
 
0.2%
³ 27
 
0.2%
Other values (28) 144
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 156746
99.5%
None 779
 
0.5%
Punctuation 21
 
< 0.1%
Letterlike Symbols 1
 
< 0.1%
Currency Symbols 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 14631
 
9.3%
12934
 
8.3%
a 12669
 
8.1%
n 11046
 
7.0%
r 10688
 
6.8%
o 9160
 
5.8%
i 9108
 
5.8%
l 7511
 
4.8%
t 5475
 
3.5%
s 5207
 
3.3%
Other values (52) 58317
37.2%
None
ValueCountFrequency (%)
à 377
48.4%
© 112
 
14.4%
¡ 65
 
8.3%
­ 35
 
4.5%
30
 
3.9%
³ 27
 
3.5%
± 17
 
2.2%
Å 15
 
1.9%
§ 15
 
1.9%
Ä 14
 
1.8%
Other values (15) 72
 
9.2%
Punctuation
ValueCountFrequency (%)
5
23.8%
5
23.8%
4
19.0%
4
19.0%
1
 
4.8%
1
 
4.8%
1
 
4.8%
Letterlike Symbols
ValueCountFrequency (%)
1
100.0%
Currency Symbols
ValueCountFrequency (%)
1
100.0%

tagline
Text

MISSING 

Distinct: 7997
Distinct (%): 99.4%
Missing: 2824
Missing (%): 26.0%
Memory size: 85.0 KiB

Length

Max length: 286
Median length: 166
Mean length: 44.174459
Min length: 1

Characters and Unicode

Total characters: 355251
Distinct characters: 126
Distinct categories: 16
Distinct scripts: 2
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7957
Unique (%): 98.9%

Sample

1st row: The park is open.
2nd row: What a Lovely Day.
3rd row: One Choice Can Destroy You
4th row: Every generation has a story.
5th row: Vengeance Hits Home
ValueCountFrequency (%)
the 3989
 
6.1%
a 2410
 
3.7%
to 1460
 
2.2%
of 1335
 
2.0%
is 1247
 
1.9%
you 1095
 
1.7%
in 951
 
1.5%
and 804
 
1.2%
one 619
 
0.9%
it 611
 
0.9%
Other values (7019) 50857
77.8%

Most occurring characters

ValueCountFrequency (%)
57392
16.2%
e 36795
 
10.4%
t 22060
 
6.2%
o 21674
 
6.1%
a 18810
 
5.3%
n 17991
 
5.1%
i 17300
 
4.9%
r 16801
 
4.7%
s 16059
 
4.5%
h 14126
 
4.0%
Other values (116) 116243
32.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 258603
72.8%
Space Separator 57393
 
16.2%
Uppercase Letter 21147
 
6.0%
Other Punctuation 16563
 
4.7%
Decimal Number 950
 
0.3%
Dash Punctuation 364
 
0.1%
Currency Symbol 96
 
< 0.1%
Other Symbol 83
 
< 0.1%
Open Punctuation 17
 
< 0.1%
Close Punctuation 15
 
< 0.1%
Other values (6) 20
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 36795
14.2%
t 22060
 
8.5%
o 21674
 
8.4%
a 18810
 
7.3%
n 17991
 
7.0%
i 17300
 
6.7%
r 16801
 
6.5%
s 16059
 
6.2%
h 14126
 
5.5%
l 11024
 
4.3%
Other values (26) 65963
25.5%
Uppercase Letter
ValueCountFrequency (%)
T 3179
15.0%
A 1962
 
9.3%
S 1542
 
7.3%
H 1327
 
6.3%
I 1262
 
6.0%
W 1211
 
5.7%
B 1038
 
4.9%
F 899
 
4.3%
N 896
 
4.2%
E 889
 
4.2%
Other values (25) 6942
32.8%
Other Punctuation
ValueCountFrequency (%)
. 10962
66.2%
' 2278
 
13.8%
, 1587
 
9.6%
! 1041
 
6.3%
? 481
 
2.9%
" 78
 
0.5%
: 40
 
0.2%
* 25
 
0.2%
& 22
 
0.1%
# 16
 
0.1%
Other values (8) 33
 
0.2%
Decimal Number
ValueCountFrequency (%)
0 266
28.0%
1 183
19.3%
2 109
11.5%
3 80
 
8.4%
9 76
 
8.0%
4 54
 
5.7%
5 54
 
5.7%
6 49
 
5.2%
7 45
 
4.7%
8 34
 
3.6%
Open Punctuation
ValueCountFrequency (%)
( 13
76.5%
[ 2
 
11.8%
1
 
5.9%
1
 
5.9%
Modifier Symbol
ValueCountFrequency (%)
˜ 1
25.0%
¯ 1
25.0%
´ 1
25.0%
` 1
25.0%
Currency Symbol
ValueCountFrequency (%)
88
91.7%
$ 7
 
7.3%
¤ 1
 
1.0%
Other Symbol
ValueCountFrequency (%)
48
57.8%
¦ 34
41.0%
© 1
 
1.2%
Math Symbol
ValueCountFrequency (%)
= 3
42.9%
+ 2
28.6%
| 2
28.6%
Space Separator
ValueCountFrequency (%)
57392
> 99.9%
  1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 13
86.7%
] 2
 
13.3%
Other Number
ValueCountFrequency (%)
½ 2
66.7%
² 1
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 364
100.0%
Modifier Letter
ValueCountFrequency (%)
ˆ 3
100.0%
Initial Punctuation
ValueCountFrequency (%)
2
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 279750
78.7%
Common 75501
 
21.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 36795
13.2%
t 22060
 
7.9%
o 21674
 
7.7%
a 18810
 
6.7%
n 17991
 
6.4%
i 17300
 
6.2%
r 16801
 
6.0%
s 16059
 
5.7%
h 14126
 
5.0%
l 11024
 
3.9%
Other values (61) 87110
31.1%
Common
ValueCountFrequency (%)
57392
76.0%
. 10962
 
14.5%
' 2278
 
3.0%
, 1587
 
2.1%
! 1041
 
1.4%
? 481
 
0.6%
- 364
 
0.5%
0 266
 
0.4%
1 183
 
0.2%
2 109
 
0.1%
Other values (45) 838
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 354937
99.9%
None 166
 
< 0.1%
Currency Symbols 88
 
< 0.1%
Letterlike Symbols 48
 
< 0.1%
Punctuation 8
 
< 0.1%
Modifier Letters 4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
57392
16.2%
e 36795
 
10.4%
t 22060
 
6.2%
o 21674
 
6.1%
a 18810
 
5.3%
n 17991
 
5.1%
i 17300
 
4.9%
r 16801
 
4.7%
s 16059
 
4.5%
h 14126
 
4.0%
Other values (78) 115929
32.7%
Currency Symbols
ValueCountFrequency (%)
88
100.0%
None
ValueCountFrequency (%)
â 86
51.8%
¦ 34
 
20.5%
 5
 
3.0%
ã 4
 
2.4%
œ 3
 
1.8%
å 3
 
1.8%
ƒ 3
 
1.8%
Š 2
 
1.2%
½ 2
 
1.2%
É 2
 
1.2%
Other values (18) 22
 
13.3%
Letterlike Symbols
ValueCountFrequency (%)
48
100.0%
Modifier Letters
ValueCountFrequency (%)
ˆ 3
75.0%
˜ 1
 
25.0%
Punctuation
ValueCountFrequency (%)
2
25.0%
2
25.0%
1
12.5%
1
12.5%
1
12.5%
1
12.5%

keywords
Text

MISSING 

Distinct 8804
Distinct (%) 93.9%
Missing 1493
Missing (%) 13.7%
Memory size 85.0 KiB

Length

Max length 131
Median length 88
Mean length 41.972474
Min length 2

Characters and Unicode

Total characters 393408
Distinct characters 87
Distinct categories 17
Distinct scripts 2
Distinct blocks 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
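
As a rough illustration of how category counts like the ones in this report can be derived, the sketch below uses Python's standard `unicodedata` module. It is not the report generator's own implementation, and the helper name `category_counts` is hypothetical. The two-letter codes it returns (for example `Ll`, `Sm`, `Zs`, `Nd`) correspond to the long names used in the tables (Lowercase Letter, Math Symbol, Space Separator, Decimal Number).

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in `values`.

    `category_counts` is a hypothetical helper written for this example.
    """
    counts = Counter()
    for text in values:
        counts.update(unicodedata.category(ch) for ch in text)
    return counts


# Two of the sample keyword rows shown in this report.
sample = [
    "monster|dna|tyrannosaurus rex|velociraptor|island",
    "android|spaceship|jedi|space opera|3d",
]
counts = category_counts(sample)
print(counts.most_common())
```

The pipe separator falls under Math Symbol (`Sm`), which is why that category ranks so high for the pipe-delimited text variables.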

Unique

Unique 8673
Unique (%) 92.5%

Sample

1st row monster|dna|tyrannosaurus rex|velociraptor|island
2nd row future|chase|post-apocalyptic|dystopia|australia
3rd row based on novel|revolution|dystopia|sequel|dystopic future
4th row android|spaceship|jedi|space opera|3d
5th row car race|speed|revenge|suspense|car
Value                 Count  Frequency (%)
of                    599    2.2%
on                    546    2.0%
director              357    1.3%
film                  231    0.9%
and                   231    0.9%
new                   203    0.8%
in                    190    0.7%
the                   177    0.7%
brother               164    0.6%
based                 163    0.6%
Other values (16457)  24006  89.4%

Most occurring characters

ValueCountFrequency (%)
e 37321
 
9.5%
i 29650
 
7.5%
a 29180
 
7.4%
| 28077
 
7.1%
r 27999
 
7.1%
o 25158
 
6.4%
n 24736
 
6.3%
t 23340
 
5.9%
s 22514
 
5.7%
17513
 
4.5%
Other values (77) 127920
32.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 346243
88.0%
Math Symbol 28078
 
7.1%
Space Separator 17536
 
4.5%
Dash Punctuation 570
 
0.1%
Other Punctuation 423
 
0.1%
Decimal Number 366
 
0.1%
Uppercase Letter 73
 
< 0.1%
Open Punctuation 30
 
< 0.1%
Close Punctuation 28
 
< 0.1%
Other Symbol 21
 
< 0.1%
Other values (7) 40
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 37321
 
10.8%
i 29650
 
8.6%
a 29180
 
8.4%
r 27999
 
8.1%
o 25158
 
7.3%
n 24736
 
7.1%
t 23340
 
6.7%
s 22514
 
6.5%
l 17116
 
4.9%
c 13962
 
4.0%
Other values (27) 95267
27.5%
Other Punctuation
ValueCountFrequency (%)
. 246
58.2%
' 156
36.9%
7
 
1.7%
§ 3
 
0.7%
3
 
0.7%
& 2
 
0.5%
¡ 2
 
0.5%
, 1
 
0.2%
1
 
0.2%
1
 
0.2%
Decimal Number
ValueCountFrequency (%)
1 82
22.4%
0 70
19.1%
9 68
18.6%
7 59
16.1%
3 41
11.2%
2 14
 
3.8%
5 12
 
3.3%
6 12
 
3.3%
8 4
 
1.1%
4 4
 
1.1%
Uppercase Letter
ValueCountFrequency (%)
à 42
57.5%
 23
31.5%
Ÿ 5
 
6.8%
Î 2
 
2.7%
Š 1
 
1.4%
Other Symbol
ValueCountFrequency (%)
© 17
81.0%
¦ 3
 
14.3%
° 1
 
4.8%
Currency Symbol
ValueCountFrequency (%)
¤ 6
46.2%
5
38.5%
¥ 2
 
15.4%
Modifier Symbol
ValueCountFrequency (%)
´ 2
33.3%
˜ 2
33.3%
¸ 2
33.3%
Math Symbol
ValueCountFrequency (%)
| 28077
> 99.9%
¬ 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
17513
99.9%
  23
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 28
93.3%
2
 
6.7%
Initial Punctuation
ValueCountFrequency (%)
6
85.7%
« 1
 
14.3%
Other Number
ValueCountFrequency (%)
¼ 3
75.0%
³ 1
 
25.0%
Dash Punctuation
ValueCountFrequency (%)
- 570
100.0%
Close Punctuation
ValueCountFrequency (%)
) 28
100.0%
Modifier Letter
ValueCountFrequency (%)
ˆ 4
100.0%
Other Letter
ValueCountFrequency (%)
º 3
100.0%
Final Punctuation
ValueCountFrequency (%)
» 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 346319
88.0%
Common 47089
 
12.0%

Most frequent character per script

Common
ValueCountFrequency (%)
| 28077
59.6%
17513
37.2%
- 570
 
1.2%
. 246
 
0.5%
' 156
 
0.3%
1 82
 
0.2%
0 70
 
0.1%
9 68
 
0.1%
7 59
 
0.1%
3 41
 
0.1%
Other values (34) 207
 
0.4%
Latin
ValueCountFrequency (%)
e 37321
 
10.8%
i 29650
 
8.6%
a 29180
 
8.4%
r 27999
 
8.1%
o 25158
 
7.3%
n 24736
 
7.1%
t 23340
 
6.7%
s 22514
 
6.5%
l 17116
 
4.9%
c 13962
 
4.0%
Other values (33) 95343
27.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 393202
99.9%
None 182
 
< 0.1%
Punctuation 13
 
< 0.1%
Modifier Letters 6
 
< 0.1%
Currency Symbols 5
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 37321
 
9.5%
i 29650
 
7.5%
a 29180
 
7.4%
| 28077
 
7.1%
r 27999
 
7.1%
o 25158
 
6.4%
n 24736
 
6.3%
t 23340
 
5.9%
s 22514
 
5.7%
17513
 
4.5%
Other values (35) 127714
32.5%
None
ValueCountFrequency (%)
à 42
23.1%
 23
12.6%
  23
12.6%
© 17
 
9.3%
å 8
 
4.4%
7
 
3.8%
¤ 6
 
3.3%
Ÿ 5
 
2.7%
â 5
 
2.7%
¦ 3
 
1.6%
Other values (24) 43
23.6%
Punctuation
ValueCountFrequency (%)
6
46.2%
3
23.1%
2
 
15.4%
1
 
7.7%
1
 
7.7%
Currency Symbols
ValueCountFrequency (%)
5
100.0%
Modifier Letters
ValueCountFrequency (%)
ˆ 4
66.7%
˜ 2
33.3%

overview
Text

Distinct 10847
Distinct (%) 99.9%
Missing 4
Missing (%) < 0.1%
Memory size 85.0 KiB

Length

Max length 1000
Median length 738
Mean length 307.03756
Min length 13

Characters and Unicode

Total characters 3335042
Distinct characters 144
Distinct categories 20
Distinct scripts 2
Distinct blocks 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 10843
Unique (%) 99.8%

Sample

1st row Twenty-two years after the events of Jurassic Park, Isla Nublar now features a fully functioning dinosaur theme park, Jurassic World, as originally envisioned by John Hammond.
2nd row An apocalyptic story set in the furthest reaches of our planet, in a stark desert landscape where humanity is broken, and most everyone is crazed fighting for the necessities of life. Within this world exist two rebels on the run who just might be able to restore order. There's Max, a man of action and a man of few words, who seeks peace of mind following the loss of his wife and child in the aftermath of the chaos. And Furiosa, a woman of action and a woman who believes her path to survival may be achieved if she can make it across the desert back to her childhood homeland.
3rd row Beatrice Prior must confront her inner demons and continue her fight against a powerful alliance which threatens to tear her society apart.
4th row Thirty years after defeating the Galactic Empire, Han Solo and his allies face a new threat from the evil Kylo Ren and his army of Stormtroopers.
5th row Deckard Shaw seeks revenge against Dominic Toretto and his family for his comatose brother.
Value                 Count   Frequency (%)
the                   31655   5.6%
a                     23240   4.1%
to                    17549   3.1%
and                   16956   3.0%
of                    15597   2.7%
in                    10160   1.8%
his                   8607    1.5%
is                    7865    1.4%
with                  5514    1.0%
her                   4793    0.8%
Other values (38551)  425509  75.0%

Most occurring characters

ValueCountFrequency (%)
556977
16.7%
e 320395
 
9.6%
t 219691
 
6.6%
a 215612
 
6.5%
i 194998
 
5.8%
o 191213
 
5.7%
n 191104
 
5.7%
s 177760
 
5.3%
r 174774
 
5.2%
h 139782
 
4.2%
Other values (134) 952736
28.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2591110
77.7%
Space Separator 557010
 
16.7%
Uppercase Letter 90606
 
2.7%
Other Punctuation 70522
 
2.1%
Dash Punctuation 9126
 
0.3%
Decimal Number 8887
 
0.3%
Open Punctuation 1981
 
0.1%
Close Punctuation 1977
 
0.1%
Currency Symbol 1893
 
0.1%
Other Symbol 1256
 
< 0.1%
Other values (10) 674
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 10241
 
11.3%
T 7658
 
8.5%
S 6972
 
7.7%
B 6009
 
6.6%
C 5663
 
6.3%
M 5455
 
6.0%
W 4604
 
5.1%
H 4191
 
4.6%
D 4027
 
4.4%
J 3560
 
3.9%
Other values (23) 32226
35.6%
Lowercase Letter
ValueCountFrequency (%)
e 320395
12.4%
t 219691
 
8.5%
a 215612
 
8.3%
i 194998
 
7.5%
o 191213
 
7.4%
n 191104
 
7.4%
s 177760
 
6.9%
r 174774
 
6.7%
h 139782
 
5.4%
l 111503
 
4.3%
Other values (21) 654278
25.3%
Other Punctuation
ValueCountFrequency (%)
, 30277
42.9%
. 27504
39.0%
' 7736
 
11.0%
" 2437
 
3.5%
: 793
 
1.1%
? 575
 
0.8%
; 471
 
0.7%
! 416
 
0.6%
/ 151
 
0.2%
& 89
 
0.1%
Other values (11) 73
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 2014
22.7%
0 1884
21.2%
9 1229
13.8%
2 987
11.1%
5 508
 
5.7%
3 489
 
5.5%
8 477
 
5.4%
7 463
 
5.2%
4 430
 
4.8%
6 406
 
4.6%
Currency Symbol
ValueCountFrequency (%)
1756
92.8%
$ 99
 
5.2%
¢ 15
 
0.8%
£ 12
 
0.6%
¤ 7
 
0.4%
¥ 4
 
0.2%
Modifier Symbol
ValueCountFrequency (%)
˜ 48
47.1%
¨ 30
29.4%
¯ 12
 
11.8%
´ 8
 
7.8%
` 3
 
2.9%
¸ 1
 
1.0%
Other Number
ValueCountFrequency (%)
³ 11
34.4%
¼ 10
31.2%
¹ 4
 
12.5%
² 3
 
9.4%
¾ 2
 
6.2%
½ 2
 
6.2%
Other Symbol
ValueCountFrequency (%)
883
70.3%
© 283
 
22.5%
¦ 71
 
5.7%
® 18
 
1.4%
° 1
 
0.1%
Final Punctuation
ValueCountFrequency (%)
174
95.1%
» 5
 
2.7%
3
 
1.6%
1
 
0.5%
Math Symbol
ValueCountFrequency (%)
± 4
40.0%
¬ 3
30.0%
= 2
20.0%
| 1
 
10.0%
Open Punctuation
ValueCountFrequency (%)
( 1967
99.3%
[ 10
 
0.5%
4
 
0.2%
Initial Punctuation
ValueCountFrequency (%)
295
96.7%
« 7
 
2.3%
3
 
1.0%
Space Separator
ValueCountFrequency (%)
556977
> 99.9%
  33
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 9124
> 99.9%
2
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 1967
99.5%
] 10
 
0.5%
Other Letter
ValueCountFrequency (%)
ª 10
76.9%
º 3
 
23.1%
Control
ValueCountFrequency (%)
13
100.0%
Format
ValueCountFrequency (%)
­ 13
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%
Modifier Letter
ValueCountFrequency (%)
ˆ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2681728
80.4%
Common 653314
 
19.6%

Most frequent character per script

Common
ValueCountFrequency (%)
556977
85.3%
, 30277
 
4.6%
. 27504
 
4.2%
- 9124
 
1.4%
' 7736
 
1.2%
" 2437
 
0.4%
1 2014
 
0.3%
( 1967
 
0.3%
) 1967
 
0.3%
0 1884
 
0.3%
Other values (69) 11427
 
1.7%
Latin
ValueCountFrequency (%)
e 320395
11.9%
t 219691
 
8.2%
a 215612
 
8.0%
i 194998
 
7.3%
o 191213
 
7.1%
n 191104
 
7.1%
s 177760
 
6.6%
r 174774
 
6.5%
h 139782
 
5.2%
l 111503
 
4.2%
Other values (55) 744896
27.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3328794
99.8%
None 3070
 
0.1%
Currency Symbols 1756
 
0.1%
Letterlike Symbols 883
 
< 0.1%
Punctuation 490
 
< 0.1%
Modifier Letters 49
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
556977
16.7%
e 320395
 
9.6%
t 219691
 
6.6%
a 215612
 
6.5%
i 194998
 
5.9%
o 191213
 
5.7%
n 191104
 
5.7%
s 177760
 
5.3%
r 174774
 
5.3%
h 139782
 
4.2%
Other values (77) 946488
28.4%
None
ValueCountFrequency (%)
â 1762
57.4%
à 489
 
15.9%
© 283
 
9.2%
œ 137
 
4.5%
¦ 71
 
2.3%
 49
 
1.6%
  33
 
1.1%
¨ 30
 
1.0%
¡ 23
 
0.7%
® 18
 
0.6%
Other values (32) 175
 
5.7%
Currency Symbols
ValueCountFrequency (%)
1756
100.0%
Letterlike Symbols
ValueCountFrequency (%)
883
100.0%
Punctuation
ValueCountFrequency (%)
295
60.2%
174
35.5%
4
 
0.8%
3
 
0.6%
3
 
0.6%
3
 
0.6%
2
 
0.4%
2
 
0.4%
2
 
0.4%
1
 
0.2%
Modifier Letters
ValueCountFrequency (%)
˜ 48
98.0%
ˆ 1
 
2.0%

runtime
Real number (ℝ)

Distinct 247
Distinct (%) 2.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 102.07086
Minimum 0
Maximum 900
Zeros 31
Zeros (%) 0.3%
Negative 0
Negative (%) 0.0%
Memory size 85.0 KiB

Quantile statistics

Minimum 0
5-th percentile 75
Q1 90
median 99
Q3 111
95-th percentile 139
Maximum 900
Range 900
Interquartile range (IQR) 21

Descriptive statistics

Standard deviation 31.381405
Coefficient of variation (CV) 0.30744724
Kurtosis 116.23757
Mean 102.07086
Median Absolute Deviation (MAD) 10
Skewness 6.1037928
Sum 1109102
Variance 984.79258
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
90                  547    5.0%
95                  358    3.3%
100                 335    3.1%
93                  328    3.0%
97                  306    2.8%
96                  300    2.8%
91                  297    2.7%
94                  292    2.7%
98                  270    2.5%
92                  270    2.5%
Other values (237)  7563   69.6%

Value  Count  Frequency (%)
0      31     0.3%
2      5      < 0.1%
3      11     0.1%
4      17     0.2%
5      17     0.2%
6      22     0.2%
7      17     0.2%
8      9      0.1%
9      7      0.1%
10     6      0.1%

Value  Count  Frequency (%)
900    1      < 0.1%
877    1      < 0.1%
705    1      < 0.1%
566    1      < 0.1%
561    1      < 0.1%
550    1      < 0.1%
540    1      < 0.1%
501    1      < 0.1%
500    1      < 0.1%
470    1      < 0.1%
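
The derived figures reported for runtime (range, interquartile range, coefficient of variation, variance) follow directly from the basic ones. The sketch below recomputes them from the values printed in this report, as a consistency check rather than as the profiler's own code. The variable names are illustrative.

```python
# Values copied from the runtime statistics in this report.
minimum, maximum = 0, 900
q1, q3 = 90, 111
mean = 102.07086
std = 31.381405

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance is the squared standard deviation

print(value_range, iqr, round(cv, 8), round(variance, 5))
```

The recomputed values agree with the reported ones up to rounding of the printed inputs.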

genres
Text

Distinct 2039
Distinct (%) 18.8%
Missing 23
Missing (%) 0.2%
Memory size 85.0 KiB

Length

Max length 51
Median length 44
Mean length 18.537121
Min length 3

Characters and Unicode

Total characters 200998
Distinct characters 30
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1225
Unique (%) 11.3%

Sample

1st row Action|Adventure|Science Fiction|Thriller
2nd row Action|Adventure|Science Fiction|Thriller
3rd row Adventure|Science Fiction|Thriller
4th row Action|Adventure|Science Fiction|Fantasy
5th row Action|Crime|Thriller
Value                 Count  Frequency (%)
comedy                712    5.8%
drama                 712    5.8%
fiction               670    5.5%
documentary           312    2.5%
drama|romance         289    2.4%
comedy|drama          280    2.3%
comedy|romance        268    2.2%
horror|thriller       259    2.1%
horror                253    2.1%
comedy|drama|romance  222    1.8%
Other values (1900)   8263   67.5%

Most occurring characters

ValueCountFrequency (%)
r 20601
 
10.2%
e 17185
 
8.5%
| 16117
 
8.0%
a 15786
 
7.9%
o 14302
 
7.1%
m 14071
 
7.0%
i 14064
 
7.0%
n 11215
 
5.6%
c 8715
 
4.3%
t 8530
 
4.2%
Other values (20) 60412
30.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 154960
77.1%
Uppercase Letter 28524
 
14.2%
Math Symbol 16117
 
8.0%
Space Separator 1397
 
0.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 20601
13.3%
e 17185
11.1%
a 15786
10.2%
o 14302
9.2%
m 14071
9.1%
i 14064
9.1%
n 11215
7.2%
c 8715
 
5.6%
t 8530
 
5.5%
y 8414
 
5.4%
Other values (7) 22077
14.2%
Uppercase Letter
ValueCountFrequency (%)
D 5281
18.5%
C 5148
18.0%
A 4555
16.0%
F 3565
12.5%
T 3075
10.8%
H 1971
 
6.9%
R 1712
 
6.0%
M 1385
 
4.9%
S 1230
 
4.3%
W 435
 
1.5%
Math Symbol
ValueCountFrequency (%)
| 16117
100.0%
Space Separator
ValueCountFrequency (%)
1397
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 183484
91.3%
Common 17514
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 20601
11.2%
e 17185
 
9.4%
a 15786
 
8.6%
o 14302
 
7.8%
m 14071
 
7.7%
i 14064
 
7.7%
n 11215
 
6.1%
c 8715
 
4.7%
t 8530
 
4.6%
y 8414
 
4.6%
Other values (18) 50601
27.6%
Common
ValueCountFrequency (%)
| 16117
92.0%
1397
 
8.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 200998
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 20601
 
10.2%
e 17185
 
8.5%
| 16117
 
8.0%
a 15786
 
7.9%
o 14302
 
7.1%
m 14071
 
7.0%
i 14064
 
7.0%
n 11215
 
5.6%
c 8715
 
4.3%
t 8530
 
4.2%
Other values (20) 60412
30.1%
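
The word table for genres appears to tokenise on whitespace, so combinations such as `drama|romance` are counted as single tokens and `Science Fiction` is split in two. A minimal sketch of counting individual genres instead, assuming the pipe character is the list separator (as the sample rows suggest), could look like this. The helper name `genre_counts` is hypothetical.

```python
from collections import Counter


def genre_counts(rows, sep="|"):
    """Count individual genres in pipe-delimited strings (hypothetical helper)."""
    counts = Counter()
    for row in rows:
        if row:  # skip missing or empty values
            counts.update(part.strip() for part in row.split(sep))
    return counts


# The five sample rows shown for the genres variable.
rows = [
    "Action|Adventure|Science Fiction|Thriller",
    "Action|Adventure|Science Fiction|Thriller",
    "Adventure|Science Fiction|Thriller",
    "Action|Adventure|Science Fiction|Fantasy",
    "Action|Crime|Thriller",
]
counts = genre_counts(rows)
print(counts.most_common())
```

The same approach would apply to the other pipe-delimited text variables, such as keywords and production_companies.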

production_companies
Text

MISSING 

Distinct 7445
Distinct (%) 75.7%
Missing 1030
Missing (%) 9.5%
Memory size 85.0 KiB

Length

Max length 184
Median length 128
Mean length 45.516165
Min length 3

Characters and Unicode

Total characters 447697
Distinct characters 120
Distinct categories 17
Distinct scripts 2
Distinct blocks 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 6850
Unique (%) 69.6%

Sample

1st row Universal Studios|Amblin Entertainment|Legendary Pictures|Fuji Television Network|Dentsu
2nd row Village Roadshow Pictures|Kennedy Miller Productions
3rd row Summit Entertainment|Mandeville Films|Red Wagon Entertainment|NeoReel
4th row Lucasfilm|Truenorth Productions|Bad Robot
5th row Universal Pictures|Original Film|Media Rights Capital|Dentsu|One Race Films
Value                 Count  Frequency (%)
pictures              1880   4.3%
productions           1750   4.0%
films                 1571   3.6%
entertainment         1187   2.7%
film                  1183   2.7%
universal             512    1.2%
fox                   480    1.1%
paramount             442    1.0%
century               421    1.0%
columbia              412    0.9%
Other values (12817)  34206  77.7%

Most occurring characters

ValueCountFrequency (%)
i 35180
 
7.9%
34207
 
7.6%
e 33390
 
7.5%
n 31750
 
7.1%
t 30796
 
6.9%
r 29487
 
6.6%
o 26480
 
5.9%
a 24331
 
5.4%
s 21969
 
4.9%
l 15887
 
3.5%
Other values (110) 164220
36.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 329907
73.7%
Uppercase Letter 63468
 
14.2%
Space Separator 34213
 
7.6%
Math Symbol 13544
 
3.0%
Other Punctuation 2192
 
0.5%
Decimal Number 1645
 
0.4%
Dash Punctuation 849
 
0.2%
Open Punctuation 721
 
0.2%
Close Punctuation 720
 
0.2%
Other Symbol 320
 
0.1%
Other values (7) 118
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 35180
10.7%
e 33390
10.1%
n 31750
9.6%
t 30796
9.3%
r 29487
8.9%
o 26480
8.0%
a 24331
 
7.4%
s 21969
 
6.7%
l 15887
 
4.8%
u 15355
 
4.7%
Other values (21) 65282
19.8%
Uppercase Letter
ValueCountFrequency (%)
P 10513
16.6%
F 7597
12.0%
C 5813
 
9.2%
M 3907
 
6.2%
E 3880
 
6.1%
S 3731
 
5.9%
B 2962
 
4.7%
T 2804
 
4.4%
A 2700
 
4.3%
G 2331
 
3.7%
Other values (20) 17230
27.1%
Other Punctuation
ValueCountFrequency (%)
. 1411
64.4%
/ 273
 
12.5%
& 214
 
9.8%
, 119
 
5.4%
' 88
 
4.0%
20
 
0.9%
¡ 18
 
0.8%
11
 
0.5%
! 11
 
0.5%
§ 7
 
0.3%
Other values (10) 20
 
0.9%
Decimal Number
ValueCountFrequency (%)
2 418
25.4%
0 401
24.4%
1 204
12.4%
4 170
10.3%
3 145
 
8.8%
9 83
 
5.0%
6 69
 
4.2%
7 63
 
3.8%
8 49
 
3.0%
5 43
 
2.6%
Modifier Symbol
ValueCountFrequency (%)
¨ 8
50.0%
´ 3
 
18.8%
¯ 3
 
18.8%
¸ 1
 
6.2%
˜ 1
 
6.2%
Math Symbol
ValueCountFrequency (%)
| 13391
98.9%
+ 132
 
1.0%
± 21
 
0.2%
Open Punctuation
ValueCountFrequency (%)
( 719
99.7%
1
 
0.1%
[ 1
 
0.1%
Currency Symbol
ValueCountFrequency (%)
¤ 20
87.0%
¢ 2
 
8.7%
¥ 1
 
4.3%
Space Separator
ValueCountFrequency (%)
34207
> 99.9%
  6
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 848
99.9%
1
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 719
99.9%
] 1
 
0.1%
Other Symbol
ValueCountFrequency (%)
© 314
98.1%
° 6
 
1.9%
Other Number
ValueCountFrequency (%)
³ 48
90.6%
¼ 5
 
9.4%
Other Letter
ValueCountFrequency (%)
ª 3
60.0%
º 2
40.0%
Format
ValueCountFrequency (%)
­ 18
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 393377
87.9%
Common 54320
 
12.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 35180
 
8.9%
e 33390
 
8.5%
n 31750
 
8.1%
t 30796
 
7.8%
r 29487
 
7.5%
o 26480
 
6.7%
a 24331
 
6.2%
s 21969
 
5.6%
l 15887
 
4.0%
u 15355
 
3.9%
Other values (52) 128752
32.7%
Common
ValueCountFrequency (%)
34207
63.0%
| 13391
 
24.7%
. 1411
 
2.6%
- 848
 
1.6%
) 719
 
1.3%
( 719
 
1.3%
2 418
 
0.8%
0 401
 
0.7%
© 314
 
0.6%
/ 273
 
0.5%
Other values (48) 1619
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 446644
99.8%
None 1027
 
0.2%
Punctuation 25
 
< 0.1%
Modifier Letters 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 35180
 
7.9%
34207
 
7.7%
e 33390
 
7.5%
n 31750
 
7.1%
t 30796
 
6.9%
r 29487
 
6.6%
o 26480
 
5.9%
a 24331
 
5.4%
s 21969
 
4.9%
l 15887
 
3.6%
Other values (75) 163167
36.5%
None
ValueCountFrequency (%)
à 514
50.0%
© 314
30.6%
³ 48
 
4.7%
± 21
 
2.0%
¤ 20
 
1.9%
¡ 18
 
1.8%
­ 18
 
1.8%
11
 
1.1%
¨ 8
 
0.8%
§ 7
 
0.7%
Other values (18) 48
 
4.7%
Punctuation
ValueCountFrequency (%)
20
80.0%
1
 
4.0%
1
 
4.0%
1
 
4.0%
1
 
4.0%
1
 
4.0%
Modifier Letters
ValueCountFrequency (%)
˜ 1
100.0%

release_date
DateTime

Distinct 5909
Distinct (%) 54.4%
Missing 0
Missing (%) 0.0%
Memory size 85.0 KiB
Minimum 1973-01-01 00:00:00
Maximum 2072-12-19 00:00:00
Histogram with fixed size bins (bins=50)
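
The reported maximum date (2072-12-19) lies well beyond the release_year maximum of 2015 reported in this same report. One possible explanation is that two-digit years in the source data were expanded into the wrong century. The sketch below shows one way such dates could be shifted back, assuming that any date later than the latest known release year is off by exactly 100 years. Both that assumption and the helper name `fix_century` are illustrative, not something the report states.

```python
from datetime import date


def fix_century(d, latest_year=2015):
    """Shift a date back 100 years if it falls after `latest_year`.

    Hypothetical helper: assumes future-looking dates come from two-digit
    years that were parsed into the wrong century.
    """
    if d.year > latest_year:
        return d.replace(year=d.year - 100)
    return d


print(fix_century(date(2072, 12, 19)))  # reported maximum
print(fix_century(date(1973, 1, 1)))    # reported minimum, left unchanged
```

Whether the shift is appropriate should be checked against release_year for each row before applying it.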

vote_count
Real number (ℝ)

HIGH CORRELATION 

Distinct 1289
Distinct (%) 11.9%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 217.38975
Minimum 10
Maximum 9767
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 85.0 KiB

Quantile statistics

Minimum 10
5-th percentile 11
Q1 17
median 38
Q3 145.75
95-th percentile 1025.75
Maximum 9767
Range 9757
Interquartile range (IQR) 128.75

Descriptive statistics

Standard deviation 575.61906
Coefficient of variation (CV) 2.6478666
Kurtosis 53.360979
Mean 217.38975
Median Absolute Deviation (MAD) 26
Skewness 6.1773058
Sum 2362157
Variance 331337.3
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
10                   501    4.6%
11                   474    4.4%
12                   422    3.9%
13                   377    3.5%
14                   323    3.0%
15                   300    2.8%
16                   270    2.5%
17                   256    2.4%
18                   218    2.0%
19                   189    1.7%
Other values (1279)  7536   69.4%

Value  Count  Frequency (%)
10     501    4.6%
11     474    4.4%
12     422    3.9%
13     377    3.5%
14     323    3.0%
15     300    2.8%
16     270    2.5%
17     256    2.4%
18     218    2.0%
19     189    1.7%

Value  Count  Frequency (%)
9767   1      < 0.1%
8903   1      < 0.1%
8458   1      < 0.1%
8432   1      < 0.1%
7375   1      < 0.1%
7080   1      < 0.1%
6882   1      < 0.1%
6723   1      < 0.1%
6498   1      < 0.1%
6417   1      < 0.1%

vote_average
Real number (ℝ)

Distinct 72
Distinct (%) 0.7%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 5.9749218
Minimum 1.5
Maximum 9.2
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 85.0 KiB

Quantile statistics

Minimum 1.5
5-th percentile 4.4
Q1 5.4
median 6
Q3 6.6
95-th percentile 7.4
Maximum 9.2
Range 7.7
Interquartile range (IQR) 1.2

Descriptive statistics

Standard deviation 0.93514182
Coefficient of variation (CV) 0.15651114
Kurtosis 0.54350325
Mean 5.9749218
Median Absolute Deviation (MAD) 0.6
Skewness -0.43590798
Sum 64923.5
Variance 0.87449021
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
6.1                496    4.6%
6                  495    4.6%
5.8                486    4.5%
5.9                473    4.4%
6.2                464    4.3%
6.3                461    4.2%
6.5                457    4.2%
6.4                446    4.1%
5.7                415    3.8%
6.6                413    3.8%
Other values (62)  6260   57.6%

Value  Count  Frequency (%)
1.5    2      < 0.1%
2      1      < 0.1%
2.1    3      < 0.1%
2.2    3      < 0.1%
2.3    2      < 0.1%
2.4    7      0.1%
2.5    2      < 0.1%
2.6    3      < 0.1%
2.7    3      < 0.1%
2.8    7      0.1%

Value  Count  Frequency (%)
9.2    1      < 0.1%
8.9    1      < 0.1%
8.8    2      < 0.1%
8.7    1      < 0.1%
8.6    1      < 0.1%
8.5    6      0.1%
8.4    10     0.1%
8.3    10     0.1%
8.2    6      0.1%
8.1    16     0.1%

release_year
Real number (ℝ)

HIGH CORRELATION 

Distinct 56
Distinct (%) 0.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 2001.3227
Minimum 1960
Maximum 2015
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 85.0 KiB

Quantile statistics

Minimum 1960
5-th percentile 1973
Q1 1995
median 2006
Q3 2011
95-th percentile 2015
Maximum 2015
Range 55
Interquartile range (IQR) 16

Descriptive statistics

Standard deviation 12.812941
Coefficient of variation (CV) 0.0064022363
Kurtosis 0.80005132
Mean 2001.3227
Median Absolute Deviation (MAD) 7
Skewness -1.2042543
Sum 21746372
Variance 164.17145
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value              Count  Frequency (%)
2014               700    6.4%
2013               659    6.1%
2015               629    5.8%
2012               588    5.4%
2011               540    5.0%
2009               533    4.9%
2008               496    4.6%
2010               490    4.5%
2007               438    4.0%
2006               408    3.8%
Other values (46)  5385   49.6%

Value  Count  Frequency (%)
1960   32     0.3%
1961   31     0.3%
1962   32     0.3%
1963   34     0.3%
1964   42     0.4%
1965   35     0.3%
1966   46     0.4%
1967   40     0.4%
1968   39     0.4%
1969   31     0.3%

Value  Count  Frequency (%)
2015   629    5.8%
2014   700    6.4%
2013   659    6.1%
2012   588    5.4%
2011   540    5.0%
2010   490    4.5%
2009   533    4.9%
2008   496    4.6%
2007   438    4.0%
2006   408    3.8%

budget_adj
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct 2614
Distinct (%) 24.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 17551040
Minimum 0
Maximum 4.25 × 10⁸
Zeros 5696
Zeros (%) 52.4%
Negative 0
Negative (%) 0.0%
Memory size 85.0 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
median 0
Q3 20853251
95-th percentile 89375138
Maximum 4.25 × 10⁸
Range 4.25 × 10⁸
Interquartile range (IQR) 20853251

Descriptive statistics

Standard deviation 34306156
Coefficient of variation (CV) 1.9546509
Kurtosis 13.036952
Mean 17551040
Median Absolute Deviation (MAD) 0
Skewness 3.1149199
Sum 1.907096 × 10¹¹
Variance 1.1769123 × 10¹⁵
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
0                    5696   52.4%
10164004.34          17     0.2%
21033371.65          17     0.2%
20000000             16     0.1%
4605455.254          15     0.1%
24234951.06          14     0.1%
33496898.69          14     0.1%
40656017.36          13     0.1%
20328008.68          13     0.1%
26291714.57          13     0.1%
Other values (2604)  5038   46.4%

Value         Count  Frequency (%)
0             5696   52.4%
0.9210910508  1      < 0.1%
0.9693980426  1      < 0.1%
1.012786634   1      < 0.1%
1.309052847   1      < 0.1%
2.908194128   1      < 0.1%
3             1      < 0.1%
4.519284805   1      < 0.1%
4.605455254   1      < 0.1%
5.006695621   1      < 0.1%

Value        Count  Frequency (%)
425000000    1      < 0.1%
368371256.2  1      < 0.1%
315500574.8  1      < 0.1%
292050672.7  1      < 0.1%
271692064.2  1      < 0.1%
271330494.3  1      < 0.1%
260000000    1      < 0.1%
257599886.7  1      < 0.1%
254100108.5  1      < 0.1%
250419201.7  1      < 0.1%

revenue_adj
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct 4840
Distinct (%) 44.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 51364363
Minimum 0
Maximum 2.8271238 × 10⁹
Zeros 6016
Zeros (%) 55.4%
Negative 0
Negative (%) 0.0%
Memory size 85.0 KiB

Quantile statistics

Minimum 0
5-th percentile 0
Q1 0
median 0
Q3 33697096
95-th percentile 2.7655444 × 10⁸
Maximum 2.8271238 × 10⁹
Range 2.8271238 × 10⁹
Interquartile range (IQR) 33697096

Descriptive statistics

Standard deviation 1.4463249 × 10⁸
Coefficient of variation (CV) 2.8158138
Kurtosis 63.379908
Mean 51364363
Median Absolute Deviation (MAD) 0
Skewness 6.2512021
Sum 5.5812517 × 10¹¹
Variance 2.0918556 × 10¹⁶
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
0                    6016   55.4%
117753430.8          2      < 0.1%
14389144.83          2      < 0.1%
1000000              2      < 0.1%
29106404.28          2      < 0.1%
31721459             2      < 0.1%
967000               2      < 0.1%
81036423.44          2      < 0.1%
57667591.03          2      < 0.1%
26331569.65          2      < 0.1%
Other values (4830)  4832   44.5%

Value        Count  Frequency (%)
0            6016   55.4%
2.37070529   1      < 0.1%
2.861933734  1      < 0.1%
3.038359901  1      < 0.1%
5.926763224  1      < 0.1%
6.951083695  1      < 0.1%
8.585801203  1      < 0.1%
9.05681977   1      < 0.1%
9.115079704  1      < 0.1%
10           1      < 0.1%

Value       Count  Frequency (%)
2827123750  1      < 0.1%
2789712242  1      < 0.1%
2506405735  1      < 0.1%
2167324901  1      < 0.1%
1907005842  1      < 0.1%
1902723130  1      < 0.1%
1791694309  1      < 0.1%
1583049536  1      < 0.1%
1574814740  1      < 0.1%
1443191435  1      < 0.1%

Interactions

[Figures: pairwise interaction plots of the numeric variables; images not preserved]
2023-10-11T16:31:53.305917image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-10-11T16:31:53.773839image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-10-11T16:31:54.263951image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-10-11T16:31:54.744733image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-10-11T16:31:55.313400image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-10-11T16:31:55.838023image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-10-11T16:31:51.400379image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-10-11T16:31:51.874540image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-10-11T16:31:52.333513image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-10-11T16:31:52.891009image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-10-11T16:31:53.351226image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-10-11T16:31:53.821313image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-10-11T16:31:54.312646image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-10-11T16:31:54.789567image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/
2023-10-11T16:31:55.356672image/svg+xmlMatplotlib v3.7.2, https://matplotlib.org/

Correlations

variable          id  popularity  budget  revenue  runtime  vote_count  vote_average  release_year  budget_adj  revenue_adj
id             1.000      -0.288  -0.330   -0.359   -0.247      -0.275        -0.131         0.691      -0.360       -0.376
popularity    -0.288       1.000   0.568    0.619    0.245       0.780         0.152         0.028       0.562        0.610
budget        -0.330       0.568   1.000    0.708    0.313       0.611         0.066        -0.033       0.994        0.695
revenue       -0.359       0.619   0.708    1.000    0.330       0.682         0.186        -0.088       0.710        0.997
runtime       -0.247       0.245   0.313    0.330    1.000       0.247         0.249        -0.180       0.324        0.334
vote_count    -0.275       0.780   0.611    0.682    0.247       1.000         0.271         0.094       0.604        0.671
vote_average  -0.131       0.152   0.066    0.186    0.249       0.271         1.000        -0.098       0.079        0.193
release_year   0.691       0.028  -0.033   -0.088   -0.180       0.094        -0.098         1.000      -0.084       -0.123
budget_adj    -0.360       0.562   0.994    0.710    0.324       0.604         0.079        -0.084       1.000        0.704
revenue_adj   -0.376       0.610   0.695    0.997    0.334       0.671         0.193        -0.123       0.704        1.000
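
The "High correlation" alerts in the overview can be reproduced from this matrix. The sketch below is not taken from ydata-profiling itself: it assumes an alert fires when the absolute coefficient is at least 0.5, a threshold inferred from the values above rather than stated in the report, and `high_correlations` is a hypothetical helper name.

```python
# Correlation matrix copied from the report (rows and columns in the same order).
names = ["id", "popularity", "budget", "revenue", "runtime", "vote_count",
         "vote_average", "release_year", "budget_adj", "revenue_adj"]
matrix = [
    [1.000, -0.288, -0.330, -0.359, -0.247, -0.275, -0.131, 0.691, -0.360, -0.376],
    [-0.288, 1.000, 0.568, 0.619, 0.245, 0.780, 0.152, 0.028, 0.562, 0.610],
    [-0.330, 0.568, 1.000, 0.708, 0.313, 0.611, 0.066, -0.033, 0.994, 0.695],
    [-0.359, 0.619, 0.708, 1.000, 0.330, 0.682, 0.186, -0.088, 0.710, 0.997],
    [-0.247, 0.245, 0.313, 0.330, 1.000, 0.247, 0.249, -0.180, 0.324, 0.334],
    [-0.275, 0.780, 0.611, 0.682, 0.247, 1.000, 0.271, 0.094, 0.604, 0.671],
    [-0.131, 0.152, 0.066, 0.186, 0.249, 0.271, 1.000, -0.098, 0.079, 0.193],
    [0.691, 0.028, -0.033, -0.088, -0.180, 0.094, -0.098, 1.000, -0.084, -0.123],
    [-0.360, 0.562, 0.994, 0.710, 0.324, 0.604, 0.079, -0.084, 1.000, 0.704],
    [-0.376, 0.610, 0.695, 0.997, 0.334, 0.671, 0.193, -0.123, 0.704, 1.000],
]

THRESHOLD = 0.5  # assumption: inferred from the report's alerts, not documented here


def high_correlations(names, matrix, threshold):
    """Map each variable to the other variables with |coefficient| >= threshold."""
    result = {}
    for i, row_name in enumerate(names):
        partners = [
            names[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            result[row_name] = partners
    return result


high = high_correlations(names, matrix, THRESHOLD)
for name, partners in high.items():
    print(f"{name}: {', '.join(partners)}")
```

Under that assumed threshold, `id` pairs only with `release_year`, while `popularity`, `budget`, `revenue`, `vote_count`, `budget_adj` and `revenue_adj` each pair with five other fields, which matches the alert list. `runtime` and `vote_average` produce no alert.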

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
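
As a rough illustration of the two ideas above (per-column null counts and nullity correlation), here is a minimal pure-Python sketch. The four rows are hypothetical and only borrow column names from the dataset, and `null_counts` and `nullity_correlation` are illustrative helper names, not part of ydata-profiling. Nullity correlation is computed here as the Pearson correlation between 0/1 "is missing" indicators.

```python
from math import sqrt

# Hypothetical rows; None marks a missing value.
rows = [
    {"homepage": None, "tagline": None, "keywords": "a"},
    {"homepage": "http://example.com", "tagline": "x", "keywords": "b"},
    {"homepage": None, "tagline": None, "keywords": None},
    {"homepage": "http://example.org", "tagline": None, "keywords": "c"},
]
columns = ["homepage", "tagline", "keywords"]


def null_counts(rows, columns):
    """Count missing values per column."""
    return {c: sum(r[c] is None for r in rows) for c in columns}


def nullity_correlation(rows, a, b):
    """Pearson correlation between the missing-value indicators of columns a and b.

    Returns None when either column is always present or always missing,
    because the correlation is undefined in that case.
    """
    x = [1.0 if r[a] is None else 0.0 for r in rows]
    y = [1.0 if r[b] is None else 0.0 for r in rows]
    n = len(rows)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


counts = null_counts(rows, columns)
corr = nullity_correlation(rows, "homepage", "tagline")
print(counts)
print(corr)
```

With a pandas DataFrame `df`, the per-column counts are usually obtained with `df.isna().sum()`.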

Sample

index | id | imdb_id | popularity | budget | revenue | original_title | cast | homepage | director | tagline | keywords | overview | runtime | genres | production_companies | release_date | vote_count | vote_average | release_year | budget_adj | revenue_adj
0 | 135397 | tt0369610 | 32.985763 | 150000000 | 1513528810 | Jurassic World | Chris Pratt|Bryce Dallas Howard|Irrfan Khan|Vincent D'Onofrio|Nick Robinson | http://www.jurassicworld.com/ | Colin Trevorrow | The park is open. | monster|dna|tyrannosaurus rex|velociraptor|island | Twenty-two years after the events of Jurassic Park, Isla Nublar now features a fully functioning dinosaur theme park, Jurassic World, as originally envisioned by John Hammond. | 124 | Action|Adventure|Science Fiction|Thriller | Universal Studios|Amblin Entertainment|Legendary Pictures|Fuji Television Network|Dentsu | 6/9/15 | 5562 | 6.5 | 2015 | 1.379999e+08 | 1.392446e+09
1 | 76341 | tt1392190 | 28.419936 | 150000000 | 378436354 | Mad Max: Fury Road | Tom Hardy|Charlize Theron|Hugh Keays-Byrne|Nicholas Hoult|Josh Helman | http://www.madmaxmovie.com/ | George Miller | What a Lovely Day. | future|chase|post-apocalyptic|dystopia|australia | An apocalyptic story set in the furthest reaches of our planet, in a stark desert landscape where humanity is broken, and most everyone is crazed fighting for the necessities of life. Within this world exist two rebels on the run who just might be able to restore order. There's Max, a man of action and a man of few words, who seeks peace of mind following the loss of his wife and child in the aftermath of the chaos. And Furiosa, a woman of action and a woman who believes her path to survival may be achieved if she can make it across the desert back to her childhood homeland. | 120 | Action|Adventure|Science Fiction|Thriller | Village Roadshow Pictures|Kennedy Miller Productions | 5/13/15 | 6185 | 7.1 | 2015 | 1.379999e+08 | 3.481613e+08
2 | 262500 | tt2908446 | 13.112507 | 110000000 | 295238201 | Insurgent | Shailene Woodley|Theo James|Kate Winslet|Ansel Elgort|Miles Teller | http://www.thedivergentseries.movie/#insurgent | Robert Schwentke | One Choice Can Destroy You | based on novel|revolution|dystopia|sequel|dystopic future | Beatrice Prior must confront her inner demons and continue her fight against a powerful alliance which threatens to tear her society apart. | 119 | Adventure|Science Fiction|Thriller | Summit Entertainment|Mandeville Films|Red Wagon Entertainment|NeoReel | 3/18/15 | 2480 | 6.3 | 2015 | 1.012000e+08 | 2.716190e+08
3 | 140607 | tt2488496 | 11.173104 | 200000000 | 2068178225 | Star Wars: The Force Awakens | Harrison Ford|Mark Hamill|Carrie Fisher|Adam Driver|Daisy Ridley | http://www.starwars.com/films/star-wars-episode-vii | J.J. Abrams | Every generation has a story. | android|spaceship|jedi|space opera|3d | Thirty years after defeating the Galactic Empire, Han Solo and his allies face a new threat from the evil Kylo Ren and his army of Stormtroopers. | 136 | Action|Adventure|Science Fiction|Fantasy | Lucasfilm|Truenorth Productions|Bad Robot | 12/15/15 | 5292 | 7.5 | 2015 | 1.839999e+08 | 1.902723e+09
4 | 168259 | tt2820852 | 9.335014 | 190000000 | 1506249360 | Furious 7 | Vin Diesel|Paul Walker|Jason Statham|Michelle Rodriguez|Dwayne Johnson | http://www.furious7.com/ | James Wan | Vengeance Hits Home | car race|speed|revenge|suspense|car | Deckard Shaw seeks revenge against Dominic Toretto and his family for his comatose brother. | 137 | Action|Crime|Thriller | Universal Pictures|Original Film|Media Rights Capital|Dentsu|One Race Films | 4/1/15 | 2947 | 7.3 | 2015 | 1.747999e+08 | 1.385749e+09
5 | 281957 | tt1663202 | 9.110700 | 135000000 | 532950503 | The Revenant | Leonardo DiCaprio|Tom Hardy|Will Poulter|Domhnall Gleeson|Paul Anderson | http://www.foxmovies.com/movies/the-revenant | Alejandro González Iñárritu | (n. One who has returned, as if from the dead.) | father-son relationship|rape|based on novel|mountains|winter | In the 1820s, a frontiersman, Hugh Glass, sets out on a path of vengeance against those who left him for dead after a bear mauling. | 156 | Western|Drama|Adventure|Thriller | Regency Enterprises|Appian Way|CatchPlay|Anonymous Content|New Regency Pictures | 12/25/15 | 3929 | 7.2 | 2015 | 1.241999e+08 | 4.903142e+08
6 | 87101 | tt1340138 | 8.654359 | 155000000 | 440603537 | Terminator Genisys | Arnold Schwarzenegger|Jason Clarke|Emilia Clarke|Jai Courtney|J.K. Simmons | http://www.terminatormovie.com/ | Alan Taylor | Reset the future | saving the world|artificial intelligence|cyborg|killer robot|future | The year is 2029. John Connor, leader of the resistance continues the war against the machines. At the Los Angeles offensive, John's fears of the unknown future begin to emerge when TECOM spies reveal a new plot by SkyNet that will attack him from both fronts; past and future, and will ultimately change warfare forever. | 125 | Science Fiction|Action|Thriller|Adventure | Paramount Pictures|Skydance Productions | 6/23/15 | 2598 | 5.8 | 2015 | 1.425999e+08 | 4.053551e+08
7 | 286217 | tt3659388 | 7.667400 | 108000000 | 595380321 | The Martian | Matt Damon|Jessica Chastain|Kristen Wiig|Jeff Daniels|Michael Peña | http://www.foxmovies.com/movies/the-martian | Ridley Scott | Bring Him Home | based on novel|mars|nasa|isolation|botanist | During a manned mission to Mars, Astronaut Mark Watney is presumed dead after a fierce storm and left behind by his crew. But Watney has survived and finds himself stranded and alone on the hostile planet. With only meager supplies, he must draw upon his ingenuity, wit and spirit to subsist and find a way to signal to Earth that he is alive. | 141 | Drama|Adventure|Science Fiction | Twentieth Century Fox Film Corporation|Scott Free Productions|Mid Atlantic Films|International Traders|TSG Entertainment | 9/30/15 | 4572 | 7.6 | 2015 | 9.935996e+07 | 5.477497e+08
8 | 211672 | tt2293640 | 7.404165 | 74000000 | 1156730962 | Minions | Sandra Bullock|Jon Hamm|Michael Keaton|Allison Janney|Steve Coogan | http://www.minionsmovie.com/ | Kyle Balda|Pierre Coffin | Before Gru, they had a history of bad bosses | assistant|aftercreditsstinger|duringcreditsstinger|evil mastermind|minions | Minions Stuart, Kevin and Bob are recruited by Scarlet Overkill, a super-villain who, alongside her inventor husband Herb, hatches a plot to take over the world. | 91 | Family|Animation|Adventure|Comedy | Universal Pictures|Illumination Entertainment | 6/17/15 | 2893 | 6.5 | 2015 | 6.807997e+07 | 1.064192e+09
9 | 150540 | tt2096673 | 6.326804 | 175000000 | 853708609 | Inside Out | Amy Poehler|Phyllis Smith|Richard Kind|Bill Hader|Lewis Black | http://movies.disney.com/inside-out | Pete Docter | Meet the little voices inside your head. | dream|cartoon|imaginary friend|animation|kid | Growing up can be a bumpy road, and it's no exception for Riley, who is uprooted from her Midwest life when her father starts a new job in San Francisco. Like all of us, Riley is guided by her emotions - Joy, Fear, Anger, Disgust and Sadness. The emotions live in Headquarters, the control center inside Riley's mind, where they help advise her through everyday life. As Riley and her emotions struggle to adjust to a new life in San Francisco, turmoil ensues in Headquarters. Although Joy, Riley's main and most important emotion, tries to keep things positive, the emotions conflict on how best to navigate a new city, house and school. | 94 | Comedy|Animation|Family | Walt Disney Pictures|Pixar Animation Studios|Walt Disney Studios Motion Pictures | 6/9/15 | 3935 | 8.0 | 2015 | 1.609999e+08 | 7.854116e+08
index | id | imdb_id | popularity | budget | revenue | original_title | cast | homepage | director | tagline | keywords | overview | runtime | genres | production_companies | release_date | vote_count | vote_average | release_year | budget_adj | revenue_adj
10856 | 20277 | tt0061135 | 0.140934 | 0 | 0 | The Ugly Dachshund | Dean Jones|Suzanne Pleshette|Charles Ruggles|Kelly Thordsen|Parley Baer | NaN | Norman Tokar | A HAPPY HONEYMOON GOES TO THE DOGS!...When a Great Dane disguised as a Dachsie crashes the party! | great dane|dachshund | The Garrisons (Dean Jones and Suzanne Pleshette) are the "proud parents" of three adorable dachshund pups -- and one overgrown Great Dane named Brutus, who nevertheless thinks of himself as a dainty dachsie. His identity crisis results in an uproarious series of household crises that reduce the Garrisons' house to shambles -- and viewers to howls of laughter! | 93 | Comedy|Drama|Family | Walt Disney Pictures | 2/16/66 | 14 | 5.7 | 1966 | 0.000000 | 0.0
10857 | 5921 | tt0060748 | 0.131378 | 0 | 0 | Nevada Smith | Steve McQueen|Karl Malden|Brian Keith|Arthur Kennedy|Suzanne Pleshette | NaN | Henry Hathaway | Some called him savage- and some called him saint... some felt his hate- and one found his love... and three had to die... | repayment|revenge|native american|wild west|half breed | Nevada Smith is the young son of an Indian mother and white father. When his father is killed by three men over gold, Nevada sets out to find them and kill them. The boy is taken in by a gun merchant. The gun merchant shows him how to shoot and to shoot on time and correct. | 128 | Action|Western | Paramount Pictures|Solar Productions|Embassy Pictures | 6/10/66 | 10 | 5.9 | 1966 | 0.000000 | 0.0
10858 | 31918 | tt0060921 | 0.317824 | 0 | 0 | The Russians Are Coming, The Russians Are Coming | Carl Reiner|Eva Marie Saint|Alan Arkin|Brian Keith|Paul Ford | NaN | Norman Jewison | IT'S A PLOT! ...to make the world die laughing!! | cold war|russian|new england | Without hostile intent, a Soviet sub runs aground off New England. Men are sent for a boat, but many villagers go into a tizzy, risking bloodshed. | 126 | Comedy|War | The Mirisch Corporation | 5/25/66 | 11 | 5.5 | 1966 | 0.000000 | 0.0
10859 | 20620 | tt0060955 | 0.089072 | 0 | 0 | Seconds | Rock Hudson|Salome Jens|John Randolph|Will Geer|Jeff Corey | NaN | John Frankenheimer | NaN | plastic surgery|suspense | A secret organisation offers wealthy people a second chance at life. The customer picks out someone they want to be and the organisation surgically alters the customer to look like the intended person, stages the customer's death, gets rid of the intended person and the customer takes on a new life. | 100 | Mystery|Science Fiction|Thriller|Drama | Gibraltar Productions|Joel Productions|John Frankenheimer Productions Inc. | 10/5/66 | 22 | 6.6 | 1966 | 0.000000 | 0.0
10860 | 5060 | tt0060214 | 0.087034 | 0 | 0 | Carry On Screaming! | Kenneth Williams|Jim Dale|Harry H. Corbett|Joan Sims|Charles Hawtrey | NaN | Gerald Thomas | Carry On Screaming with the Hilarious CARRY ON Gang!! | monster|carry on|horror spoof | The sinister Dr Watt has an evil scheme going. He's kidnapping beautiful young women and turning them into mannequins to sell to local stores. Fortunately for Dr Watt, Detective-Sergeant Bung is on the case, and he doesn't have a clue! In this send up of the Hammer Horror movies, there are send-ups of all the horror greats from Frankenstein to Dr Jekyl and Mr Hyde. | 87 | Comedy | Peter Rogers Productions|Anglo-Amalgamated Film Distributors | 5/20/66 | 13 | 7.0 | 1966 | 0.000000 | 0.0
10861 | 21 | tt0060371 | 0.080598 | 0 | 0 | The Endless Summer | Michael Hynson|Robert August|Lord 'Tally Ho' Blears|Bruce Brown|Chip Fitzwater | NaN | Bruce Brown | NaN | surfer|surfboard|surfing | The Endless Summer, by Bruce Brown, is one of the first and most influential surf movies of all times. The film documents American surfers Mike Hynson and Robert August as they travel the world during California’s winter (which back in 1965 was off-season for surfing) in search of the perfect wave and an endless summer. | 95 | Documentary | Bruce Brown Films | 6/15/66 | 11 | 7.4 | 1966 | 0.000000 | 0.0
10862 | 20379 | tt0060472 | 0.065543 | 0 | 0 | Grand Prix | James Garner|Eva Marie Saint|Yves Montand|Toshirō Mifune|Brian Bedford | NaN | John Frankenheimer | Cinerama sweeps YOU into a drama of speed and spectacle! | car race|racing|formula 1 | Grand Prix driver Pete Aron is fired by his team after a crash at Monaco that injures his teammate, Scott Stoddard. While Stoddard struggles to recover, Aron begins to drive for another team, and starts dating Stoddard's wife. | 176 | Action|Adventure|Drama | Cherokee Productions|Joel Productions|Douglas & Lewis Productions | 12/21/66 | 20 | 5.7 | 1966 | 0.000000 | 0.0
10863 | 39768 | tt0060161 | 0.065141 | 0 | 0 | Beregis Avtomobilya | Innokentiy Smoktunovskiy|Oleg Efremov|Georgi Zhzhyonov|Olga Aroseva|Lyubov Dobrzhanskaya | NaN | Eldar Ryazanov | NaN | car|trolley|stealing car | An insurance agent who moonlights as a carthief steals cars various crooks and never from the common people. He sells the stolen cars and gives the money to charity. His best friend, a cop, is assigned to bring in this modern robin hood. | 94 | Mystery|Comedy | Mosfilm | 1/1/66 | 11 | 6.5 | 1966 | 0.000000 | 0.0
10864 | 21449 | tt0061177 | 0.064317 | 0 | 0 | What's Up, Tiger Lily? | Tatsuya Mihashi|Akiko Wakabayashi|Mie Hama|John Sebastian|Tadao Nakamaru | NaN | Woody Allen | WOODY ALLEN STRIKES BACK! | spoof | In comic Woody Allen's film debut, he took the Japanese action film "International Secret Police: Key of Keys" and re-dubbed it, changing the plot to make it revolve around a secret egg salad recipe. | 80 | Action|Comedy | Benedict Pictures Corp. | 11/2/66 | 22 | 5.4 | 1966 | 0.000000 | 0.0
10865 | 22293 | tt0060666 | 0.035919 | 19000 | 0 | Manos: The Hands of Fate | Harold P. Warren|Tom Neyman|John Reynolds|Diane Mahree|Stephanie Nielson | NaN | Harold P. Warren | It's Shocking! It's Beyond Your Imagination! | fire|gun|drive|sacrifice|flashlight | A family gets lost on the road and stumbles upon a hidden, underground, devil-worshiping cult led by the fearsome Master and his servant Torgo. | 74 | Horror | Norm-Iris | 11/15/66 | 15 | 1.5 | 1966 | 127642.279154 | 0.0
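
The sample rows show two things worth handling before analysis: multi-valued text fields such as `genres` and `cast` pack their values with `|`, and `release_date` uses two-digit years. The sketch below is one possible way to deal with both and is not part of the report. `split_multi` and `parse_release_date` are hypothetical helper names, and the choice to trust `release_year` over the parsed century is an assumption. Python's `%y` directive maps 00 to 68 onto 2000 to 2068, so a 1966 film parsed on its own would land in 2066.

```python
from datetime import datetime


def split_multi(value):
    """Split a pipe-delimited field into a list; missing values give an empty list."""
    if value is None or value == "NaN":
        return []
    return value.split("|")


def parse_release_date(release_date, release_year):
    """Parse an M/D/YY date and take the century from the release_year column."""
    parsed = datetime.strptime(release_date, "%m/%d/%y").date()
    if parsed.year != release_year:
        # %y guessed the wrong century (e.g. 66 -> 2066); use the explicit year.
        parsed = parsed.replace(year=release_year)
    return parsed


genres = split_multi("Action|Adventure|Science Fiction|Thriller")
old_film = parse_release_date("2/16/66", 1966)   # The Ugly Dachshund
new_film = parse_release_date("6/9/15", 2015)    # Jurassic World
print(genres)
print(old_film.isoformat(), new_film.isoformat())
```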

Duplicate rows

Most frequently occurring

index | id | imdb_id | popularity | budget | revenue | original_title | cast | homepage | director | tagline | keywords | overview | runtime | genres | production_companies | release_date | vote_count | vote_average | release_year | budget_adj | revenue_adj | # duplicates
0 | 42194 | tt0411951 | 0.59643 | 30000000 | 967000 | TEKKEN | Jon Foo|Kelly Overton|Cary-Hiroyuki Tagawa|Ian Anthony Dale|Luke Goss | NaN | Dwight H. Little | Survival is no game | martial arts|dystopia|based on video game|martial arts tournament | In the year of 2039, after World Wars destroy much of the civilization as we know it, territories are no longer run by governments, but by corporations; the mightiest of which is the Mishima Zaibatsu. In order to placate the seething masses of this dystopia, Mishima sponsors Tekken, a tournament in which fighters battle until only one is left standing. | 92 | Crime|Drama|Action|Thriller|Science Fiction | Namco|Light Song Films | 3/20/10 | 110 | 5.0 | 2010 | 30000000.0 | 967000.0 | 2
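
A duplicate count like the one above can be obtained by counting identical rows. This is a minimal sketch and not the profiler's own implementation. The rows are cut down to three fields (id, original_title, release_year) for brevity, which is an assumption for illustration only; the report compares full rows.

```python
from collections import Counter

# Abbreviated, illustrative rows: (id, original_title, release_year).
rows = [
    (42194, "TEKKEN", 2010),
    (135397, "Jurassic World", 2015),
    (42194, "TEKKEN", 2010),
    (76341, "Mad Max: Fury Road", 2015),
]

# Count how often each exact row occurs and keep those seen more than once.
counts = Counter(rows)
duplicates = {row: n for row, n in counts.items() if n > 1}

for row, n in duplicates.items():
    print(row, "occurs", n, "times")
```

With a pandas DataFrame `df`, `df.duplicated(keep=False)` flags every member of a duplicate group, and `df.drop_duplicates()` keeps one copy of each.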